How to force a reset of a device (disk) in an enclosure slot
Kenneth D. Merry
ken at FreeBSD.ORG
Sat Sep 15 03:28:34 UTC 2012
On Sat, Sep 15, 2012 at 03:13:05 +0000, John wrote:
> ----- Kenneth D. Merry's Original Message -----
> > On Sat, Sep 15, 2012 at 02:24:37 +0000, John wrote:
> > > Hi Folks,
> > >
> > > I've been poking around and can't seem to find a way to reset and
> > > hopefully acquire access to a disk device in an enclosure. For instance:
> > >
> > > FreeBSD 9.1-PRERELEASE
> > >
> > > # camcontrol smpphylist ses4
> > > 37 PHYs:
> > > PHY Attached SAS Address
> > > 0 0x5000039368233602 <HP EG0600FBDSR HPD4> (pass105,da98)
> > > 1 0x5000039368238e3e <HP EG0600FBDSR HPD4> (pass106,da99)
> > > 2 0x500003936823bca2 <HP EG0600FBDSR HPD4> (pass107,da100)
> > > 3 0x500003936819507e <HP EG0600FBDSR HPD4> (pass108,da101)
> > > 4 0x5000039368197d5a <HP EG0600FBDSR HPD4> (pass109,da102)
> > > 5 0x5000039368197c6e <HP EG0600FBDSR HPD4> (pass110,da103)
> > > 6 0x500003936818770e <HP EG0600FBDSR HPD2> (pass111,da104)
> > > 7 0x5000039368238eba <HP EG0600FBDSR HPD4> (pass112,da105)
> > > 8 0x5000039368232f42 <HP EG0600FBDSR HPD4> (pass113,da106)
> > > 9 0x0000000000000000
> > > 10 0x500003936813c31e
> > > 11 0x5000039368233892 <HP EG0600FBDSR HPD4> (pass114,da107)
> > > 12 0x500003936813c2ca <HP EG0600FBDSR HPD4> (pass115,da108)
> > > ...
> > >
> > > Note, bay/slot 10 has a listed device address. If I were to pull the
> > > drive and re-insert it, it would show up (as da390 in this case).
> > > The above is after a fresh reboot. Note da106 to da107 skipping
> > > slot 10 (slot 9 is empty).
> > >
> > > The smp utils provide a similar view:
> > >
> > > # smp_discover /dev/ses4
> > > phy 0:D:attached:[5000039368233602:00 t(SSP)] 6 Gbps
> > > phy 1:D:attached:[5000039368238e3e:00 t(SSP)] 6 Gbps
> > > phy 2:D:attached:[500003936823bca2:00 t(SSP)] 6 Gbps
> > > phy 3:D:attached:[500003936819507e:00 t(SSP)] 6 Gbps
> > > phy 4:D:attached:[5000039368197d5a:00 t(SSP)] 6 Gbps
> > > phy 5:D:attached:[5000039368197c6e:00 t(SSP)] 6 Gbps
> > > phy 6:D:attached:[500003936818770e:00 t(SSP)] 6 Gbps
> > > phy 7:D:attached:[5000039368238eba:00 t(SSP)] 6 Gbps
> > > phy 8:D:attached:[5000039368232f42:00 t(SSP)] 6 Gbps
> > > phy 10:D:attached:[500003936813c31e:00 t(SSP)] 6 Gbps
> > > phy 11:D:attached:[5000039368233892:00 t(SSP)] 6 Gbps
> > > phy 12:D:attached:[500003936813c2ca:00 t(SSP)] 6 Gbps
> > > ...
> > >
> > > The address of slot 10 matches. There is a disk in the slot - just
> > > isn't recognized and attached.
> > >
> > > Back to the basic question. How can I issue a command to the enclosure
> > > to force a re-initialization of the device to recover it without
> > > having to physically pull & insert it. Even if the device numbers
> > > are not sequential, I need access to the drive...
> >
> > You can try sending a link reset:
> >
> > camcontrol smppc ses4 -p 10 -o linkreset
> >
> > It may or may not work. You can also try disabling the PHY (-o disable)
> > and then sending a link reset to re-enable the link. You can also try a
> > hard reset (-o hardreset)
>
> Hi Ken,
>
> Well, I hadn't tried to actually disable the device. That did bring some
> reaction:
>
> # camcontrol smppc ses4 -p 10 -o disable
> # camcontrol smpphylist ses4
> 37 PHYs:
> PHY Attached SAS Address
> 0 0x5000039368233602 <HP EG0600FBDSR HPD4> (pass105,da98)
> ....
> 8 0x5000039368232f42 <HP EG0600FBDSR HPD4> (pass113,da106)
> 9 0x0000000000000000
> 10 0x0000000000000000
> 11 0x5000039368233892 <HP EG0600FBDSR HPD4> (pass114,da107)
> ...
>
> The device is gone.
>
> # camcontrol smppc ses4 -p 10 -o hardreset
> root at vprzfs01p:/root # camcontrol smpphylist ses4
> 37 PHYs:
> PHY Attached SAS Address
> 0 0x5000039368233602 <HP EG0600FBDSR HPD4> (pass105,da98)
> ....
> 8 0x5000039368232f42 <HP EG0600FBDSR HPD4> (pass113,da106)
> 9 0x0000000000000000
> 10 0x500003936813c31e
> 11 0x5000039368233892 <HP EG0600FBDSR HPD4> (pass114,da107)
> ...
>
> The device is back, but not attached - This msg:
>
> kernel: mps1: mpssas_alloc_tm freezing simq
> kernel: mps1: mpssas_remove_complete on handle 0x0069, IOCStatus= 0x0
> kernel: mps1: mpssas_free_tm releasing simq
> kernel: _mapping_add_new_device: failed to add the device with handle 0x0069 to persistent table because there is no free space available - entry 0

That message is harmless; it won't prevent the drive from attaching.

> From a debug statement in the driver: MaxPersistentEntries == 128, but I
> have more than 128 devices per LSI card and they normally all show up -
> though I do get a bunch of the above messages in dmesg..

You might try turning on some of the debugging in the mps(4) driver and
disabling and resetting the link again.

Try:

sysctl -w dev.mps.0.debug_level=0xf

You might get a lot of output, so be prepared to reset it back to 4:

sysctl -w dev.mps.0.debug_level=4
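
The sequence discussed in this thread (raise the mps(4) debug level, disable
the PHY, send a link reset, check the PHY list, then drop the debug level back
to 4) could be wrapped in a small script. This is only a sketch: the `run` and
`reset_phy` helpers, the `DRY_RUN` switch and the 10-second wait are made up
for illustration, and the mps unit number is an argument because it has to
match the controller in question (the kernel messages quoted above come from
mps1).

```shell
#!/bin/sh
# Sketch: cycle one expander PHY while mps(4) debugging is turned up.
# With DRY_RUN=1 (the default here) the commands are only printed.

DRY_RUN=${DRY_RUN:-1}

# run <command...>: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# reset_phy <ses device> <phy number> <mps unit>  (hypothetical helper)
reset_phy() {
    dev=$1
    phy=$2
    unit=$3
    run sysctl -w "dev.mps.${unit}.debug_level=0xf"    # verbose driver output
    run camcontrol smppc "$dev" -p "$phy" -o disable   # take the PHY down
    run camcontrol smppc "$dev" -p "$phy" -o linkreset # bring the link back up
    run sleep 10                                       # arbitrary wait (assumption)
    run camcontrol smpphylist "$dev"                   # see whether a da device attached
    run sysctl -w "dev.mps.${unit}.debug_level=4"      # back to the level given above
}
```

After sourcing the file, `reset_phy ses4 10 1` prints the six commands without
running them; setting `DRY_RUN=0` runs them for real. If the link reset alone
does not bring the drive back, `-o hardreset` can be substituted for
`-o linkreset` by hand.
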
Ken
--
Kenneth Merry
ken at FreeBSD.ORG