Disk reordering on LSI SAS2008/mps(4)

Kevin Bowling kevin.bowling at kev009.com
Mon Sep 11 23:52:46 UTC 2017


r323384 fixed it for us: https://svnweb.freebsd.org/base?view=revision&revision=323384

On Mon, Aug 28, 2017 at 8:08 AM, Stephen Mcconnell
<stephen.mcconnell at broadcom.com> wrote:
> I'm assuming that the debug_level is in hex, right? If it is, then the run
> where debug_level is 0x583 should be showing some Mapping debug output,
> but I don't see any. Do you have mapping enabled in the controller? You
> can see the mapping flags in IOC Page 8 in the Flags field. Do you have a
> way to look at the controller pages? You'd need either lsiutil or maybe
> mpsutil will work (Scott Long wrote mpsutil and I don't know anything
> about it).
>
> If you don't have mapping enabled, you won't be guaranteed that the
> devices will be discovered in the same order over a controller reset or
> reboot.
>
> Steve
>
>> -----Original Message-----
>> From: owner-freebsd-scsi at freebsd.org [mailto:owner-freebsd-
>> scsi at freebsd.org] On Behalf Of Kevin Bowling
>> Sent: Sunday, August 27, 2017 8:51 PM
>> To: FreeBSD-scsi
>> Cc: John Baldwin
>> Subject: Re: Disk reordering on LSI SAS2008/mps(4)
>>
>> Note that we only see this bug with EARLY_AP_STARTUP enabled.
>>
>> Regards,
>>
>> On Fri, Aug 25, 2017 at 2:11 PM, Jason Wolfe <j at nitrology.com> wrote:
>>
>> > Attachments are useful.
>> >
>> >
>> > On 2017-08-25 13:58, Jason Wolfe wrote:
>> >
>> >> Hi!
>> >>
>> >> We've been having an issue where we see some disk reordering on boot
>> >> on HEAD from mid-July on LSI controllers, maybe 5% of the time. We
>> >> brought mps current as of r322364 with no change in behavior.
>> >>
>> >> I have a few logs attached with various debug output. In all cases
>> >> I've seen the pass ordering to be proper, and cam does try to resolve
>> >> the da ordering, but the device it tries to reassign to is already
>> >> taken. The full output is attached, and some relevant bits are listed
>> >> below for the casual reader. Since the functionality in
>> >> scsi_da.c has been fairly static, and it's attempting to reassign, it
>> >> seems more likely we are running into something in mps here. The
>> >> targets always look to be proper.
>> >>
>> >> The various settings of hw.mps.use_phy_num (-1/0/1) don't change the
>> >> behavior, and neither does hw.mps.enable_ssu=0. We have machines over
>> >> various FW versions (15/16) that see the issue. I'm wondering if the
>> >> fact that we see this issue over soft reboots means that the firmware
>> >> isn't coming into play. To confirm, we are booting from the
>> >> controller, so the LSI BIOS is enabled.
>> >>
>> >> mps0@pci0:3:0:0:        class=0x010700 card=0x040015d9 chip=0x00721000 rev=0x03 hdr=0x00
>> >>     vendor     = 'LSI Logic / Symbios Logic'
>> >>     device     = 'SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]'
>> >>     class      = mass storage
>> >>     subclass   = SAS
>> >>
>> >> reorder-verbose.txt:
>> >> boot_verbose="YES"
>> >> hw.mps.debug_level="71"
>> >>
>> >> da0 at mps0 bus 0 scbus0 target 17 lun 0
>> >> cam_periph_alloc: attempt to re-allocate valid device da0 rejected flags 0x102 refcount 4
>> >> da1 at mps0 bus 0 scbus0 target 8 lun 0
>> >> daasync: Unable to attach to new device due to status 0x6
>> >> da2 at mps0 bus 0 scbus0 target 9 lun 0
>> >> ...
>> >> da8 at mps0 bus 0 scbus0 target 15 lun 0
>> >> da9 at mps0 bus 0 scbus0 target 16 lun 0
>> >> da10 at mps0 bus 0 scbus0 target 18 lun 0
>> >> da11 at mps0 bus 0 scbus0 target 19 lun 0
>> >>
>> >> pass0 at mps0 bus 0 scbus0 target 8 lun 0
>> >> pass1 at mps0 bus 0 scbus0 target 9 lun 0
>> >> ...
>> >> pass9 at mps0 bus 0 scbus0 target 17 lun 0
>> >> pass10 at mps0 bus 0 scbus0 target 18 lun 0
>> >> pass11 at mps0 bus 0 scbus0 target 19 lun 0
>> >>
>> >>
>> >>
>> >>
>> >> reorder-mps-mapping.txt:
>> >> hw.mps.debug_level="583"
>> >>
>> >> da0 at mps0 bus 0 scbus0 target 19 lun 0
>> >> da1 at mps0 bus 0 scbus0 target 8 lun 0
>> >> da2 at mps0 bus 0 scbus0 target 9 lun 0
>> >> ...
>> >> da9 at mps0 bus 0 scbus0 target 16 lun 0
>> >> da10 at mps0 bus 0 scbus0 target 17 lun 0
>> >> da11 at mps0 bus 0 scbus0 target 18 lun 0
>> >> cam_periph_alloc: attempt to re-allocate valid device da0 rejected flags 0x106 refcount 6
>> >> daasync: Unable to attach to new device due to status 0x6
>> >>
>> >> ses0: da1,pass0: Element descriptor: 'Slot 01'
>> >> ses0: da1,pass0: SAS Device Slot Element: 1 Phys at Slot 0
>> >> ses0: da0,pass11: Element descriptor: 'Slot 12'
>> >> ses0: da0,pass11: SAS Device Slot Element: 1 Phys at Slot 11
>> >>
>> >>
>> >> Luckily we have found a way to fairly easily repro it over a few
>> >> hours, so we are open to any suggestions.
>> >>
>> >> Thanks!
>> >> Jason
>> >
>> >
>> > _______________________________________________
>> > freebsd-scsi at freebsd.org mailing list
>> > https://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe at freebsd.org"
>> >

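On the hex-versus-decimal question raised in the thread, here is a rough sketch of what each reading of the two hw.mps.debug_level values would enable. It assumes the bit assignments from the debug_level table in mps(4); the labels in the dictionary are shorthand chosen for this example, not driver identifiers, so check them against the man page for your release. Under that assumption, only the decimal reading of 583 sets the 0x200 mapping bit.

```python
# Assumed bit layout, taken from the debug_level table in mps(4).
# The labels are illustrative shorthand, not names from the driver.
MPS_DEBUG_BITS = {
    0x001: "info",
    0x002: "fault",
    0x004: "event",
    0x008: "log",
    0x010: "recovery",
    0x020: "error",
    0x040: "init",
    0x080: "xinfo",
    0x100: "user",
    0x200: "mapping",
    0x400: "trace",
}


def decode(level):
    """Return the labels of all bits set in level, lowest bit first."""
    return [name for bit, name in sorted(MPS_DEBUG_BITS.items()) if level & bit]


# The two values quoted in the thread, read as decimal and as hex.
for text in ("71", "583"):
    for base in (10, 16):
        value = int(text, base)
        print(f"{text} (base {base}) = {value:#x}: {', '.join(decode(value))}")
```

Writing the tunable with an explicit 0x prefix in loader.conf avoids the ambiguity altogether.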

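For anyone trying to spot the reordering across many boot logs, the following is a minimal sketch that parses the CAM attach lines and reports where the target id drops between consecutive unit numbers. The sample lines are copied from the reorder-verbose.txt excerpt quoted above, with the elided entries left out; the helper names are made up for this example.

```python
import re

# Matches CAM attach lines such as "da0 at mps0 bus 0 scbus0 target 17 lun 0".
ATTACH_RE = re.compile(
    r"^(da|pass)(\d+) at \w+ bus \d+ scbus\d+ target (\d+) lun \d+"
)


def parse_attach(lines):
    """Map periph name ('da' or 'pass') to {unit number: target id}."""
    found = {"da": {}, "pass": {}}
    for line in lines:
        m = ATTACH_RE.match(line.strip())
        if m:
            found[m.group(1)][int(m.group(2))] = int(m.group(3))
    return found


def inversions(unit_to_target):
    """Return (unit, target, next_unit, next_target) where the target id drops."""
    ordered = sorted(unit_to_target.items())
    return [
        (u1, t1, u2, t2)
        for (u1, t1), (u2, t2) in zip(ordered, ordered[1:])
        if t2 < t1
    ]


SAMPLE = """\
da0 at mps0 bus 0 scbus0 target 17 lun 0
da1 at mps0 bus 0 scbus0 target 8 lun 0
da2 at mps0 bus 0 scbus0 target 9 lun 0
da8 at mps0 bus 0 scbus0 target 15 lun 0
da9 at mps0 bus 0 scbus0 target 16 lun 0
da10 at mps0 bus 0 scbus0 target 18 lun 0
da11 at mps0 bus 0 scbus0 target 19 lun 0
pass0 at mps0 bus 0 scbus0 target 8 lun 0
pass1 at mps0 bus 0 scbus0 target 9 lun 0
pass9 at mps0 bus 0 scbus0 target 17 lun 0
pass10 at mps0 bus 0 scbus0 target 18 lun 0
pass11 at mps0 bus 0 scbus0 target 19 lun 0
""".splitlines()

found = parse_attach(SAMPLE)
for periph in ("da", "pass"):
    for u1, t1, u2, t2 in inversions(found[periph]):
        print(f"{periph}{u1} (target {t1}) sorts before {periph}{u2} (target {t2})")
```

On this sample the check flags only da0 against da1 and reports nothing for the pass devices, which matches the observation that the pass ordering is proper while the da ordering is not. Comparing only neighbouring units keeps the check usable on excerpts where some lines are missing.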