da2:ciss1:0:0:0): Periph destroyed
Sean Bruno
sbruno at freebsd.org
Wed Sep 2 19:05:51 UTC 2015
On 09/02/15 12:02, Per olof Ljungmark wrote:
> On 2015-09-02 20:40, Sean Bruno wrote:
>>
>>
>> On 09/02/15 10:59, Per olof Ljungmark wrote:
>>> On 2015-09-02 19:23, Sean Bruno wrote:
>>>>
>>>>
>>>> On 09/02/15 09:30, Per olof Ljungmark wrote:
>>>>> Hi,
>>>>>
>>>>> Recent 10-STABLE, HP D2600 with 12 SATA drives in RAID10
>>>>> via a P812 controller, 7TB capacity as one volume, ZFS.
>>>>>
>>>>> If I pull a drive from the array, the following occurs. I am
>>>>> not sure about the logic here, because the array is still
>>>>> intact and no data is lost, yet the volume is gone.
>>>>>
>>>>> # zpool clear imap
>>>>> cannot clear errors for imap: I/O error
>>>>>
>>>>> # zpool online imap da2
>>>>> cannot online da2: pool I/O is currently suspended
>>>>>
>>>>> Only a reboot helped and then the pool came up just fine,
>>>>> no errors, but that is not exactly what you want on a
>>>>> production box.
>>>>>
>>>>> Did I miss something?
>>>>>
>>>>> Would geli_autodetach="NO" help?
>>>>>
>>>>> syslog output:
>>>>>
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** Hot-plug drive removed, Port=1E Box=1 Bay=2 SN= Z4Z2S9SD
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** Physical drive failure, Port=1E Box=1 Bay=2
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=REGENING
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: logical drive 0 (da2) changed status OK->interim recovery, spare status 0x21<configured>
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=NEEDS_REBUILD
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: ciss1: logical drive 0 (da2) changed status interim recovery->ready for recovery, spare status 0x11<configured,available>
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: da2 at ciss1 bus 0 scbus2 target 0 lun 0
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: da2: <HP RAID 1(1+0) read> s/n PAGXQ0BRH1W0WA detached
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: (da2:ciss1:0:0:0): Periph destroyed
>>>>> Sep 2 17:55:19 <user.notice> str devd: Executing 'logger -p kern.notice -t ZFS 'vdev is removed, pool_guid=13539160044045520113 vdev_guid=1325849881310347579''
>>>>> Sep 2 17:55:19 <user.notice> str ZFS: vdev is removed, pool_guid=13539160044045520113 vdev_guid=1325849881310347579
>>>>> Sep 2 17:55:19 <kern.crit> str kernel: (da2:ciss1:0:0:0): fatal error, could not acquire reference count
>>>>> Sep 2 17:55:23 <kern.crit> str kernel: ciss1: *** State change, logical drive 0, new state=REBUILDING
>>>>> Sep 2 17:55:23 <kern.crit> str kernel: ciss1: logical drive 0 (da2) changed status ready for recovery->recovering, spare status 0x13<configured,rebuilding,available>
>>>>> Sep 2 17:55:23 <kern.crit> str kernel: cam_periph_alloc: attempt to re-allocate valid device da2 rejected flags 0x18 refcount 1
>>>>> Sep 2 17:55:23 <kern.crit> str kernel: daasync: Unable to attach to new device due to status 0x6
>>>>> _______________________________________________
>>>>> freebsd-scsi at freebsd.org mailing list
>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-scsi To
>>>>> unsubscribe, send any mail to
>>>>> "freebsd-scsi-unsubscribe at freebsd.org"
>>>>>
>>>>
>>>>
>>>> This looks like a bug I introduced at r249170. Now that I
>>>> stare deeply into the abyss of ciss(4), I think the entire
>>>> change is wrong.
>>>>
>>>> Do you want to try reverting that change from your kernel and
>>>> rebuilding for a test? I no longer have access to ciss(4)
>>>> hardware and cannot verify it myself.
>>>>
>>
>>> Yes, I can try. The installed rev is r281826, but I assume the
>>> change applies here too?
>>
>>
>>
>> Yeah, I think a "svn merge -c -249170" from /usr/src should do
>> it, if you are managing your system from svn.
>>
>
> Sep 2 20:54:05 <kern.crit> str kernel: ciss1: *** Hot-plug drive removed, Port=1E Box=1 Bay=3 SN= W4Z1G4BD
> Sep 2 20:54:05 <kern.crit> str kernel: ciss1: *** Physical drive failure, Port=1E Box=1 Bay=3
> Sep 2 20:54:50 <kern.crit> str kernel: ciss1: *** Hot-plug drive inserted, Port=1E Box=1 Bay=3 SN= WD-WMC1P0F66XVC
> Sep 2 20:54:50 <kern.crit> str kernel: ciss1: *** HP Array Controller Firmware Ver = 6.64, Build Num = 0
>
>
> Right, this time it survived; the volume did not detach after the
> revert.
>
> If this change does not cause any other problems, do you think it
> can go into -STABLE?
>
> Thanks!
>
> //per
>
Definitely. I'll yank it out today and set up a 3-day MFC.
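For reference, the reverse-merge test suggested upthread can be sketched as below. This is only a sketch: it assumes an svn-managed source tree checked out at /usr/src and the GENERIC kernel configuration (substitute your own KERNCONF), and it must be run as root.

```shell
# Reverse-merge r249170 out of the local tree and boot a test kernel.
# Assumptions: svn checkout at /usr/src, KERNCONF=GENERIC.
cd /usr/src
svn merge -c -249170 .          # -c -REV applies the change in reverse
make -j"$(sysctl -n hw.ncpu)" buildkernel KERNCONF=GENERIC
make installkernel KERNCONF=GENERIC
shutdown -r now                 # reboot onto the reverted kernel
```

If the pull-a-drive test then passes, as it did above, the revert can be committed to head and later merged to stable via the usual MFC procedure.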
sean