How to force a reset of a device (disk) in an enclosure slot

Kenneth D. Merry ken at FreeBSD.ORG
Sat Sep 15 03:28:34 UTC 2012


On Sat, Sep 15, 2012 at 03:13:05 +0000, John wrote:
> ----- Kenneth D. Merry's Original Message -----
> > On Sat, Sep 15, 2012 at 02:24:37 +0000, John wrote:
> > > Hi Folks,
> > > 
> > >    I've been poking around and can't seem to find a way to reset and
> > > hopefully acquire access to a disk device in an enclosure. For instance:
> > > 
> > > FreeBSD 9.1-PRERELEASE
> > > 
> > > # camcontrol smpphylist ses4
> > > 37 PHYs:
> > > PHY  Attached SAS Address
> > >   0  0x5000039368233602   <HP EG0600FBDSR HPD4>             (pass105,da98)
> > >   1  0x5000039368238e3e   <HP EG0600FBDSR HPD4>             (pass106,da99)
> > >   2  0x500003936823bca2   <HP EG0600FBDSR HPD4>             (pass107,da100)
> > >   3  0x500003936819507e   <HP EG0600FBDSR HPD4>             (pass108,da101)
> > >   4  0x5000039368197d5a   <HP EG0600FBDSR HPD4>             (pass109,da102)
> > >   5  0x5000039368197c6e   <HP EG0600FBDSR HPD4>             (pass110,da103)
> > >   6  0x500003936818770e   <HP EG0600FBDSR HPD2>             (pass111,da104)
> > >   7  0x5000039368238eba   <HP EG0600FBDSR HPD4>             (pass112,da105)
> > >   8  0x5000039368232f42   <HP EG0600FBDSR HPD4>             (pass113,da106)
> > >   9  0x0000000000000000
> > >  10  0x500003936813c31e
> > >  11  0x5000039368233892   <HP EG0600FBDSR HPD4>             (pass114,da107)
> > >  12  0x500003936813c2ca   <HP EG0600FBDSR HPD4>             (pass115,da108)
> > > ...
> > > 
> > > Note, bay/slot 10 has a listed device address. If I were to pull the
> > > drive and re-insert it, it would show up (as da390 in this case).
> > > The above is after a fresh reboot. Note da106 to da107 skipping
> > > slot 10 (slot 9 is empty).
> > > 
> > > The smp utils provide a similar view:
> > > 
> > > # smp_discover /dev/ses4 
> > >   phy   0:D:attached:[5000039368233602:00  t(SSP)]  6 Gbps
> > >   phy   1:D:attached:[5000039368238e3e:00  t(SSP)]  6 Gbps
> > >   phy   2:D:attached:[500003936823bca2:00  t(SSP)]  6 Gbps
> > >   phy   3:D:attached:[500003936819507e:00  t(SSP)]  6 Gbps
> > >   phy   4:D:attached:[5000039368197d5a:00  t(SSP)]  6 Gbps
> > >   phy   5:D:attached:[5000039368197c6e:00  t(SSP)]  6 Gbps
> > >   phy   6:D:attached:[500003936818770e:00  t(SSP)]  6 Gbps
> > >   phy   7:D:attached:[5000039368238eba:00  t(SSP)]  6 Gbps
> > >   phy   8:D:attached:[5000039368232f42:00  t(SSP)]  6 Gbps
> > >   phy  10:D:attached:[500003936813c31e:00  t(SSP)]  6 Gbps
> > >   phy  11:D:attached:[5000039368233892:00  t(SSP)]  6 Gbps
> > >   phy  12:D:attached:[500003936813c2ca:00  t(SSP)]  6 Gbps
> > > ...
> > > 
> > > The address of slot 10 matches. There is a disk in the slot; it just
> > > isn't recognized and attached.
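
One way to spot such a PHY without reading the whole list by eye is to
filter the smpphylist output for entries that report a non-zero SAS
address but no (passN,daN) pair. A minimal sketch, assuming the column
layout shown above; unattached_phys is a made-up helper name, not part
of camcontrol:

```shell
#!/bin/sh
# Hypothetical helper: read `camcontrol smpphylist` output on stdin and
# print the PHY numbers that have a non-zero attached SAS address but no
# "(passN,daN)" peripheral listed. Assumes the layout quoted in this thread.
unattached_phys() {
    awk '
        $1 ~ /^[0-9]+$/ &&              # first column is a PHY number
        $2 ~ /^0x[0-9a-fA-F]+$/ &&      # second column is a SAS address
        $2 !~ /^0x0+$/ &&               # skip empty slots (all-zero address)
        index($0, "(") == 0 {           # no peripheral pair on the line
            print $1
        }'
}
```

Used as "camcontrol smpphylist ses4 | unattached_phys", it would print 10
for the lines shown above. PHYs that link to another expander or to the
HBA may also be listed, so treat the result as a list of candidates.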
> > > 
> > > Back to the basic question: how can I issue a command to the enclosure
> > > to force a re-initialization of the device to recover it without
> > > having to physically pull & insert it? Even if the device numbers
> > > are not sequential, I need access to the drive...
> > 
> > You can try sending a link reset:
> > 
> > camcontrol smppc ses4 -p 10 -o linkreset
> > 
> > It may or may not work.  You can also try disabling the PHY (-o disable)
> > and then sending a link reset to re-enable the link.  You can also try a
> > hard reset (-o hardreset).
> 
> Hi Ken,
> 
> Well, I hadn't tried to actually disable the device. That did bring some
> reaction:
> 
> # camcontrol smppc ses4 -p 10 -o disable
> # camcontrol smpphylist ses4
> 37 PHYs:
> PHY  Attached SAS Address
>   0  0x5000039368233602   <HP EG0600FBDSR HPD4>             (pass105,da98)
> ....
>   8  0x5000039368232f42   <HP EG0600FBDSR HPD4>             (pass113,da106)
>   9  0x0000000000000000
>  10  0x0000000000000000
>  11  0x5000039368233892   <HP EG0600FBDSR HPD4>             (pass114,da107)
> ...
> 
> The device is gone.
> 
> # camcontrol smppc ses4 -p 10 -o hardreset
> root@vprzfs01p:/root # camcontrol smpphylist ses4
> 37 PHYs:
> PHY  Attached SAS Address
>   0  0x5000039368233602   <HP EG0600FBDSR HPD4>             (pass105,da98)
> ....
>   8  0x5000039368232f42   <HP EG0600FBDSR HPD4>             (pass113,da106)
>   9  0x0000000000000000
>  10  0x500003936813c31e
>  11  0x5000039368233892   <HP EG0600FBDSR HPD4>             (pass114,da107)
> ...
> 
> The device is back, but not attached. This message appears:
> 
> kernel: mps1: mpssas_alloc_tm freezing simq
> kernel: mps1: mpssas_remove_complete on handle 0x0069, IOCStatus= 0x0
> kernel: mps1: mpssas_free_tm releasing simq
> kernel: _mapping_add_new_device: failed to add the device with handle 0x0069 to persistent table because there is no free space available - entry 0

That message is harmless; it won't prevent the drive from attaching.

> From a debug statement in the driver: MaxPersistentEntries == 128, but I
> have more than 128 devices per LSI card and they normally all show up,
> though I do get a bunch of the above messages in dmesg.

You might try turning on some of the debugging in the mps(4) driver and
disabling and resetting the link again.

Try:

sysctl -w dev.mps.0.debug_level=0xf

You might get a lot of output, so be prepared to reset it back to 4:

sysctl -w dev.mps.0.debug_level=4
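
Put together, the whole sequence might look roughly like the sketch
below. It only strings together the sysctl and camcontrol commands
already mentioned in this thread. The names debug_cycle_phy, RUN and
PHY_SETTLE are invented for illustration, and the two-second pause is a
guess, not a documented requirement:

```shell
#!/bin/sh
# Sketch only. debug_cycle_phy, RUN and PHY_SETTLE are illustrative names,
# not part of camcontrol(8) or mps(4).
# Set RUN=echo to print the commands instead of executing them.
RUN=${RUN:-}

debug_cycle_phy() {
    unit=$1   # mps(4) unit number (assumption: the HBA behind the expander)
    dev=$2    # SES/SMP device, e.g. ses4
    phy=$3    # PHY number, e.g. 10

    # Raise driver debugging before touching the link.
    $RUN sysctl -w dev.mps."$unit".debug_level=0xf || return 1

    # Disable the PHY, wait briefly (assumed delay), then link reset it.
    $RUN camcontrol smppc "$dev" -p "$phy" -o disable
    sleep "${PHY_SETTLE:-2}"
    $RUN camcontrol smppc "$dev" -p "$phy" -o linkreset
    rc=$?

    # Always drop the debug level back to 4, even if the reset failed.
    $RUN sysctl -w dev.mps."$unit".debug_level=4
    return "$rc"
}
```

For example, "debug_cycle_phy 1 ses4 10" if the expander sits behind mps1,
which is the unit named in the kernel messages quoted above. Check which
controller the enclosure is really attached to before picking the unit.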

Ken
-- 
Kenneth Merry
ken at FreeBSD.ORG