LSI mpt(4) driver problem : can't SMART poll, controller freezes
Douglas Gilbert
dgilbert at interlog.com
Sat Oct 27 17:12:19 UTC 2012
On 12-10-27 12:10 AM, Stephane LAPIE wrote:
> Hello list,
>
> I have two controller cards of the following make (PCI-X controllers) :
> Oct 24 09:26:00 eirei-no-za kernel: mpt0: <LSILogic SAS/SATA Adapter> port 0x2000-0x20ff mem 0xdfa20000-0xdfa23fff,0xdfa00000-0xdfa0ffff irq 24 at device 1.0 on pci6
> Oct 24 09:26:00 eirei-no-za kernel: mpt0: MPI Version=1.5.12.0
> Oct 24 09:26:00 eirei-no-za kernel: mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
> Oct 24 09:26:00 eirei-no-za kernel: mpt0: 0 Active Volumes (2 Max)
> Oct 24 09:26:00 eirei-no-za kernel: mpt0: 0 Hidden Drive Members (10 Max)
>
> Oct 24 09:26:00 eirei-no-za kernel: mpt1: <LSILogic SAS/SATA Adapter> port 0x2400-0x24ff mem 0xdfa24000-0xdfa27fff,0xdfa10000-0xdfa1ffff irq 28 at device 7.0 on pci6
> Oct 24 09:26:00 eirei-no-za kernel: mpt1: MPI Version=1.5.12.0
> Oct 24 09:26:00 eirei-no-za kernel: mpt1: Capabilities: ( RAID-0 RAID-1E RAID-1 )
> Oct 24 09:26:00 eirei-no-za kernel: mpt1: 0 Active Volumes (2 Max)
> Oct 24 09:26:00 eirei-no-za kernel: mpt1: 0 Hidden Drive Members (10 Max)
>
> Each of them has 8 ports, used in the following fashion :
> <ATA ST32000641AS CC13> at scbus0 target 0 lun 0 (pass0,da0)
> <ATA ST32000542AS CC37> at scbus0 target 1 lun 0 (pass1,da1)
> <ATA ST32000641AS CC13> at scbus0 target 3 lun 0 (pass2,da2)
> <ATA ST32000641AS CC13> at scbus0 target 4 lun 0 (pass3,da3)
> <ATA ST32000542AS CC34> at scbus0 target 5 lun 0 (pass4,da4)
> <ATA ST32000641AS CC13> at scbus0 target 6 lun 0 (pass5,da5)
> <ATA ST32000542AS CC37> at scbus0 target 7 lun 0 (pass6,da6)
>
> <ATA ST32000641AS CC13> at scbus2 target 0 lun 0 (pass7,da7)
> <ATA ST32000542AS CC34> at scbus2 target 1 lun 0 (pass8,da8)
> <ATA ST32000542AS CC37> at scbus2 target 2 lun 0 (pass9,da9)
> <ATA ST32000542AS CC34> at scbus2 target 3 lun 0 (pass10,da10)
> <ATA ST32000542AS CC34> at scbus2 target 4 lun 0 (pass11,da11)
> <ATA ST32000542AS CC37> at scbus2 target 5 lun 0 (pass12,da12)
> <ATA ST32000542AS CC34> at scbus2 target 6 lun 0 (pass13,da13)
> <ATA ST32000641AS CC13> at scbus2 target 7 lun 0 (da14,pass14)
>
> It should also be noted that I have to override the default SCSI timeout
> delay, in order to ensure proper detection of all devices at boot by
> putting the following in /boot/loader.conf :
> kern.cam.scsi_delay=15000
>
> I wanted to know if anyone had experienced the following problems, and
> found a way around them :
>
>
>
> 1) I can't run any detailed and meaningful SMART polls on disks
> belonging to these controllers. (execution logs as separate files)
>
> As can be seen I am running the latest available version of smartctl
> from the ports :
> http://www.yomi.darkbsd.org/~darksoul/eirei-no-za-broken-disk-smart-log.txt
Bad link, as are the rest in this post. Which version
of smartmontools are you using?
Doug Gilbert
> (Using the pass devices gives the same result)
>
> Only the "-d scsi" polling returns any somewhat meaningful info
> (disk serial number etc), but even that is misleading, as the disk
> was actually nearing death.
> Here is the full SMART log recovered from running the disk from a
> USB->SATA device :
> http://www.yomi.darkbsd.org/~darksoul/eirei-no-za-broken-disk-smart-log2.txt
>
> I actually have scripts to monitor that, but they obviously rely on
> smartctl being able to do its job, which it currently can't...
> (Also, this worked perfectly fine under 8-STABLE with "-d sat"...)
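
For monitoring scripts of the kind mentioned above, one way to notice
that smartctl could not actually talk to a disk is to decode its exit
status, which the smartctl(8) man page documents as a bitmask. The
following is only a rough sketch, not the poster's script: the names
decode_status and poll are made up for illustration, the device path
and the "-d sat" default are assumptions, and it needs Python 3.7 or
later with smartctl in PATH.

```python
import subprocess

# Exit-status bits as documented in smartctl(8); wording abbreviated here.
EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed",
    2: "a SMART or other ATA command failed",
    3: "SMART status: DISK FAILING",
    4: "prefail attributes at or below threshold",
    5: "attributes were at or below threshold in the past",
    6: "device error log contains errors",
    7: "device self-test log contains errors",
}


def decode_status(code):
    """Return the messages for every bit set in a smartctl exit status."""
    return [msg for bit, msg in EXIT_BITS.items() if code & (1 << bit)]


def poll(device, dev_type="sat"):
    """Hypothetical helper: run a health check and decode the result.

    Returns (reachable, messages). 'reachable' is False when any of
    bits 0-2 is set, i.e. smartctl could not query the disk properly,
    so a "no problems" result must not be trusted.
    """
    proc = subprocess.run(
        ["smartctl", "-d", dev_type, "-H", "-i", device],
        capture_output=True,
        text=True,
    )
    reachable = (proc.returncode & 0b111) == 0
    return reachable, decode_status(proc.returncode)


if __name__ == "__main__":
    # /dev/da0 is only an example device name.
    ok, messages = poll("/dev/da0")
    print("reachable" if ok else "NOT reachable", messages)
```

Treating "could not query the disk" as an alert of its own, rather
than as "no problems found", would at least flag the situation
described here, where the poll fails while the disk is dying.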
>
>
>
> 2) Less annoying, but still something of a show-stopper for any serious
> work requiring high availability :
> Any disk I/O freeze ends up locking the whole controller (and the whole
> ZFS pool...) until either the server crashes or the disk bails out,
> whichever comes first, really. (kernel log as separate file)
>
> http://www.yomi.darkbsd.org/~darksoul/eirei-no-za-mpt-timeout.txt
>
>
> Thanks for your time.
>