Not all disks come back after power cycling a JBOD

Alan Somers asomers at freebsd.org
Thu Apr 2 18:59:45 UTC 2020


On Thu, Apr 2, 2020 at 2:48 AM Andriy Gapon <avg at freebsd.org> wrote:

> On 30/03/2020 19:05, Alan Somers wrote:
> > If I remove a hot-swappable SCSI drive and reinsert it, FreeBSD always
> > seems to handle that just fine.  But if instead I unplug or power off an
> > entire JBOD, then reattach it, frequently FreeBSD fails to recreate all
> > of the device nodes.  Using "mpsutil show devices" or "mprutil show
> > devices" I can see all of the devices that I'm expecting.  However,
> > "camcontrol devlist" doesn't show them, and "camcontrol rescan" doesn't
> > help.
> >
> > This has been the situation for as long as I can remember, several years
> > at least.  But now it's starting to cause problems for me.  Before I try
> > to debug this myself, does anybody know anything about the problem?
>
> I have been trying to help a user with this problem with the mpr driver.
> It seemed that the problem happened at the controller or expander level.
> At least, I could not see any problem with the driver.
>
> Some things we saw:
> - the problem could be reproduced with Linux as well
> - it was always the same slots / expander ports that could get the problem
>
> We collected logs after doing these things:
> - dev.mpr.0.debug_level=0x6ff
> - camcontrol debug -I -P -c -p <bus>
>
> From what I could see in the logs, the affected disks were in a permanent
> reset state, and that's what the controller kept reporting.
> The driver kept getting SasTopologyChangeList events where the affected
> disks kept oscillating between PHYLinkStatusChange and TargetMissing.
> E.g., PHY 3 and 5 here:
> EventDataLength: 6
> AckRequired: 0
> Event: SasTopologyChangeList (0x1c)
> EventContext: 0x0
> EnclosureHandle: 0x2
> ExpanderDevHandle: 0x9
> NumPhys: 39
> NumEntries: 3
> StartPhyNum: 3
> ExpStatus: Responding (0x3)
> PhysicalPort: 0
> PHY[3].AttachedDevHandle: 0x000d
> PHY[3].LinkRate: 12.0Gbps (0xb0)
> PHY[3].PhyStatus: PHYLinkStatusChange
> PHY[4].AttachedDevHandle: 0x000e
> PHY[4].LinkRate: 12.0Gbps (0xbb)
> PHY[4].PhyStatus: PHYLinkStatusUnchanged
> PHY[5].AttachedDevHandle: 0x000f
> PHY[5].LinkRate: 12.0Gbps (0xb0)
> PHY[5].PhyStatus: PHYLinkStatusChange
>
> EventDataLength: 6
> AckRequired: 0
> Event: SasTopologyChangeList (0x1c)
> EventContext: 0x0
> EnclosureHandle: 0x2
> ExpanderDevHandle: 0x9
> NumPhys: 39
> NumEntries: 3
> StartPhyNum: 3
> ExpStatus: Responding (0x3)
> PhysicalPort: 0
> PHY[3].AttachedDevHandle: 0x000d
> PHY[3].LinkRate: LinkRate Unknown (0xb)
> PHY[3].PhyStatus: TargetMissing
> PHY[4].AttachedDevHandle: 0x000e
> PHY[4].LinkRate: 12.0Gbps (0xbb)
> PHY[4].PhyStatus: PHYLinkStatusUnchanged
> PHY[5].AttachedDevHandle: 0x000f
> PHY[5].LinkRate: LinkRate Unknown (0xb)
> PHY[5].PhyStatus: TargetMissing
>
> There were also SasDeviceStatusChange like this:
> mpr0: EventReply    :
>     EventDataLength: 7
>     AckRequired: 0
>     Event: SasDeviceStatusChange (0xf)
>     EventContext: 0x20
>     TaskTag: 0xffff
>     ReasonCode: Internal Device Reset
>     ASC: 0x0
>     ASCQ: 0x0
>     DevHandle: 0x20
>     SASAddress: 0x5000cca2584a54cd
>
> mpr0: EventReply    :
>     EventDataLength: 7
>     AckRequired: 0
>     Event: SasDeviceStatusChange (0xf)
>     EventContext: 0x20
>     TaskTag: 0xffff
>     ReasonCode: Cmp Internal Device Reset
>     ASC: 0x0
>     ASCQ: 0x0
>     DevHandle: 0x20
>     SASAddress: 0x5000cca2584a54cd
>
> Finally, the user discovered that after sas3flash -reset the controller
> (and FreeBSD) is able to see all disks again.
>
> If anyone has any thoughts / suggestions they are very welcome!
>

Thanks for the tip, avg!  sas2flash -reset worked.  At least, it worked for
the case where "mpsutil show devices" shows missing devices.  There was one
case where "mprutil show devices" looked fine.  But I haven't been able to
reproduce that failure yet.  I'll let you know if I ever do.  In the
meantime, I'll add sas2flash/sas3flash to my toolkit.
-Alan
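
P.S. For anyone collecting the same debug logs: the oscillation avg
describes (a PHY reported as PHYLinkStatusChange in one
SasTopologyChangeList event and TargetMissing in another) can be picked out
mechanically.  Below is a minimal sketch, assuming the log lines look like
the "PHY[n].PhyStatus: ..." lines quoted above.  The function name
flapping_phys and the sample text are hypothetical, made up for
illustration; this is not part of mprutil or the driver.

```python
import re
from collections import defaultdict

# Matches lines such as "PHY[3].PhyStatus: TargetMissing", with or without
# a leading "> " quote prefix or "mpr0:" indentation.
PHY_STATUS = re.compile(r"PHY\[(\d+)\]\.PhyStatus:\s*(\S+)")

def flapping_phys(log_text):
    """Return the sorted PHY numbers that were reported with both
    PHYLinkStatusChange and TargetMissing somewhere in log_text."""
    seen = defaultdict(set)
    for match in PHY_STATUS.finditer(log_text):
        seen[int(match.group(1))].add(match.group(2))
    wanted = {"PHYLinkStatusChange", "TargetMissing"}
    return sorted(phy for phy, statuses in seen.items() if wanted <= statuses)

# Status lines taken from the two events quoted in this thread.
sample = """
PHY[3].PhyStatus: PHYLinkStatusChange
PHY[4].PhyStatus: PHYLinkStatusUnchanged
PHY[5].PhyStatus: PHYLinkStatusChange
PHY[3].PhyStatus: TargetMissing
PHY[4].PhyStatus: PHYLinkStatusUnchanged
PHY[5].PhyStatus: TargetMissing
"""

print(flapping_phys(sample))  # prints [3, 5]
```

It only looks at which statuses each PHY number ever reported, so it does
not distinguish between expanders; with more than one expander in the log
the events would need to be grouped by ExpanderDevHandle first.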


More information about the freebsd-scsi mailing list