mpr(4) bug ?

geoffroy desvernay dgeo at centrale-marseille.fr
Mon Dec 12 14:37:09 UTC 2016


Hi all,

First, I'm not fluent in SCSI or kernel C, so please don't byte
too hard if I'm missing something obvious :)

I tried a few things before posting here: testing the hardware under
Linux (it works flawlessly there) and with the vendor's test software,
swapping the adapter (for a different one with the same chipset, which
is all I have), upgrading firmware where available (card and Dell
enclosure), and trying to read mpr(4)'s code… but that is beyond my
knowledge.

Hardware: Dell PowerEdge R430 with an LSI SAS3008 card and an MD1420
enclosure with 24 2 TB Seagate SAS drives.
This machine also has an embedded SAS3008 (Dell PERC H330) in non-RAID
mode (mrsas driver) with 4 SSD drives to be used as ZFS cache/log…

System: FreeBSD 11.0-RELEASE-p3

Please tell me if there are tests I could run or patches to try.
I am currently compiling an 11-STABLE kernel with sys/dev/mpr from
CURRENT, but I have no other leads…

# pciconf -lv:
mpr0 at pci0:4:0:0:        class=0x010700 card=0x1f461028 chip=0x00971000 rev=0x02 hdr=0x00
    vendor     = 'LSI Logic / Symbios Logic'
    device     = 'SAS3008 PCI-Express Fusion-MPT SAS-3'
    class      = mass storage
    subclass   = SAS

Symptoms: any zpool create fails with
# zpool create ztest raidz da20 da21 da22
cannot create 'ztest': invalid argument for this pool operation

# dmesg shows a bunch of messages like this one:
(da22:mpr0:0:26:0): READ(10). CDB: 28 00 e8 e0 88 af 00 00 01 00
(da22:mpr0:0:26:0): CAM status: SCSI Status Error
(da22:mpr0:0:26:0): SCSI status: Check Condition
(da22:mpr0:0:26:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
(da22:mpr0:0:26:0): Error 22, Unretryable error

(see http://dgeo.perso.ec-m.fr/mpr_fail.txt for full related dmesg)
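For readers who want to check what the failing command actually asks for, here is a minimal sketch that decodes the CDB bytes from the message above using the standard READ(10) layout (opcode, 4-byte big-endian LBA, 2-byte big-endian transfer length). The function name decode_read10 is made up for this illustration and is not part of any FreeBSD tool.

```shell
#!/bin/sh
# Decode a 10-byte READ(10) CDB given as space-separated hex bytes.
# decode_read10 is a hypothetical helper, not an existing utility.
decode_read10() {
    # Split the argument string into positional parameters $1..${10}.
    set -- $1
    [ "$#" -eq 10 ] || { echo "expected 10 bytes, got $#" >&2; return 1; }
    # CDB byte 0 is the operation code (0x28 for READ(10)).
    opcode=$((0x$1))
    # CDB bytes 2-5 hold the logical block address, big-endian.
    lba=$(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
    # CDB bytes 7-8 hold the transfer length in blocks, big-endian.
    blocks=$(( (0x$8 << 8) | 0x$9 ))
    printf 'opcode=0x%02x lba=%d blocks=%d\n' "$opcode" "$lba" "$blocks"
}

decode_read10 "28 00 e8 e0 88 af 00 00 01 00"
# prints: opcode=0x28 lba=3907029167 blocks=1
```

So the rejected command is a plain one-block READ(10) at LBA 3907029167. If these drives expose 3907029168 sectors of 512 bytes (an assumption on my side, it is a common figure for 2 TB disks), that would be the very last sector of the disk.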

# camcontrol devlist 2 seems normal to me:
<ATA INTEL SSDSC2BX20 DL2B>        at scbus1 target 0 lun 0 (pass0,da0)
<ATA INTEL SSDSC2BX20 DL2B>        at scbus1 target 1 lun 0 (pass1,da1)
<ATA INTEL SSDSC2BA40 DL2B>        at scbus1 target 2 lun 0 (pass2,da2)
<ATA INTEL SSDSC2BA40 DL2B>        at scbus1 target 3 lun 0 (pass3,da3)
<DP BP13G+ 2.23>                   at scbus1 target 32 lun 0 (pass4,ses0)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 8 lun 0 (pass5,da4)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 9 lun 0 (pass6,da5)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 10 lun 0 (pass7,da6)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 11 lun 0 (pass8,da7)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 12 lun 0 (pass9,da8)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 13 lun 0 (pass10,da9)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 14 lun 0 (pass11,da10)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 15 lun 0 (pass12,da11)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 16 lun 0 (pass13,da12)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 17 lun 0 (pass14,da13)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 18 lun 0 (pass15,da14)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 19 lun 0 (pass16,da15)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 20 lun 0 (pass17,da16)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 21 lun 0 (pass18,da17)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 22 lun 0 (pass19,da18)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 23 lun 0 (pass20,da19)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 24 lun 0 (pass21,da20)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 25 lun 0 (pass22,da21)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 26 lun 0 (pass23,da22)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 27 lun 0 (pass24,da23)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 28 lun 0 (pass25,da24)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 29 lun 0 (pass26,da25)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 30 lun 0 (pass27,da26)
<SEAGATE ST2000NX0433 NS02>        at scbus2 target 31 lun 0 (pass28,da27)
<DELL MD1420 1.07>                 at scbus2 target 32 lun 0 (pass29,ses1)
<AHCI SGPIO Enclosure 1.00 0001>   at scbus7 target 0 lun 0 (pass30,ses2)

With dev.mpr.0.debug_level set to 1023, I tried a simple dd test: dd
reports success with bs <= 127k and fails with bs >= 128k (in both
cases there are ILLEGAL REQUEST errors in the logs):
dd if=/tmp/rnd of=/dev/da20 bs=127k:
http://dgeo.perso.ec-m.fr/dd_bs_127k.debug.log

bs=128k: http://dgeo.perso.ec-m.fr/dd_bs_128k.debug.log
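To narrow down the block-size threshold, the test could be repeated over several sizes. The following is only a sketch: probe_bs is a hypothetical helper name, and it reads from the target instead of writing to it, so it is non-destructive but not identical to the write test above. Sizes are given in KiB and converted to bytes so the dd syntax stays portable.

```shell
#!/bin/sh
# Read one block of each given size (in KiB) from a target and report
# whether dd exited successfully. probe_bs is a hypothetical helper.
probe_bs() {
    target=$1
    shift
    for kib in "$@"; do
        # count=1 issues a single read of the requested size.
        if dd if="$target" of=/dev/null bs=$((kib * 1024)) count=1 2>/dev/null; then
            echo "bs=${kib}k ok"
        else
            echo "bs=${kib}k FAILED"
        fi
    done
}

# Example invocation (device name taken from the test above):
#   probe_bs /dev/da20 64 127 128 256
```

Since dd reported success at 127k while the logs still showed ILLEGAL REQUEST, the exit status alone is not conclusive; dmesg should be checked after each size as well.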


-- 
*geoffroy desvernay*
C.R.I - Administration systèmes et réseaux
Ecole Centrale de Marseille
Tel: (+33|0)4 91 05 45 24
Fax: (+33|0)4 91 05 44 26



