[Bug 211990] iscsi fails to reconnect and does not release devices

bugzilla-noreply at freebsd.org
Thu Nov 10 09:57:56 UTC 2016


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211990

--- Comment #18 from Julien Cigar <julien at perdition.city> ---
The problem appeared again today, after ~15 days of uptime, still on:

FreeBSD filer1.prod.lan 10.3-RELEASE-p11 FreeBSD 10.3-RELEASE-p11 #0: Mon Oct 24
18:49:24 UTC 2016
root at amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): no ping reply (NOP-In)
after 5 seconds; reconnecting
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In)
after 5 seconds; reconnecting
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 01 ef ec 90 00 00 01 00 
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 01 ef ec 8e 00 00 01 00 
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:iscsi1:0:0:0): Retrying command
(da2:iscsi2:0:0:0): Retrying command
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: <FREEBSD CTLDISK 0001> s/n MYSERIAL   0 detached
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: <FREEBSD CTLDISK 0001> s/n MYSERIAL   1 detached
(da2:iscsi2:0:0:0): Periph destroyed
(da3:iscsi1:0:0:0): Periph destroyed
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-4 SCSI device
da2: Serial Number MYSERIAL   1
da2: 150.000MB/s transfers
da2: Command Queueing enabled
da2: 1840144MB (471076881 4096 byte sectors)
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In)
after 5 seconds; reconnecting
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 1c 14 10 0f 00 00 01 00 
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): Retrying command
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: <FREEBSD CTLDISK 0001> s/n MYSERIAL   1 detached
(da2:iscsi2:0:0:0): Periph destroyed
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-4 SCSI device
da2: Serial Number MYSERIAL   1
da2: 150.000MB/s transfers
da2: Command Queueing enabled
da2: 1840144MB (471076881 4096 byte sectors)
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): login timed out after 6
seconds; reconnecting
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-4 SCSI device
da3: Serial Number MYSERIAL   0
da3: 150.000MB/s transfers
da3: Command Queueing enabled
da3: 1840144MB (471076881 4096 byte sectors)
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In)
after 5 seconds; reconnecting
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 1c 14 10 0f 00 00 01 00 
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): Retrying command
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: <FREEBSD CTLDISK 0001> s/n MYSERIAL   1 detached
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In)
after 5 seconds; reconnecting
(da2:iscsi2:0:0:0): Periph destroyed
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-4 SCSI device
da2: Serial Number MYSERIAL   1
da2: 150.000MB/s transfers
da2: Command Queueing enabled
da2: 1840144MB (471076881 4096 byte sectors)
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): handoff on already
connected session
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): connection error;
reconnecting
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: <FREEBSD CTLDISK 0001> s/n MYSERIAL   0 detached
(da3:iscsi1:0:0:0): Periph destroyed
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-4 SCSI device
da3: Serial Number MYSERIAL   0
da3: 150.000MB/s transfers
da3: Command Queueing enabled
da3: 1840144MB (471076881 4096 byte sectors)
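
To make the aborted requests above easier to read, here is a minimal sketch that decodes the READ(10) CDB bytes as CAM prints them. It assumes the standard READ(10) layout (opcode 0x28, 4-byte big-endian LBA in bytes 2-5, 2-byte transfer length in bytes 7-8) and the 4096-byte sector size reported for da2/da3. The function name `decode_read10` is made up for illustration and is not part of any FreeBSD tool.

```python
def decode_read10(cdb_hex: str, sector_size: int = 4096) -> dict:
    """Decode a READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return {
        "lba": lba,
        "blocks": blocks,
        "byte_offset": lba * sector_size,     # assumes the sector size from the log
        "byte_length": blocks * sector_size,
    }


if __name__ == "__main__":
    # First aborted read on da3 in the log above.
    print(decode_read10("28 00 01 ef ec 90 00 00 01 00"))
```

For the first aborted command this gives LBA 32500880 and a transfer length of one block, i.e. a single 4096-byte read.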

After a zpool online, with vfs.zfs.scrub_delay = 0 and
vfs.zfs.resilver_delay = 0, I issued a zpool scrub and hit the timeout again:

WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target0): no ping reply (NOP-In)
after 5 seconds; reconnecting
WARNING: 10.20.30.32 (iqn.2016-08.lan.prod:target1): no ping reply (NOP-In)
after 5 seconds; reconnecting
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 9a 67 00 00 01 00 
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 98 3c 00 00 01 00 
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:iscsi1:0:0:0): Retrying command
(da2:iscsi2:0:0:0): Retrying command
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 9f e4 00 00 01 00 
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 a8 cc 00 00 01 00 
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:iscsi1:0:0:0): Retrying command
(da2:iscsi2:0:0:0): Retrying command
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 a2 42 00 00 01 00 
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 95 6e 00 00 20 00 
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:iscsi1:0:0:0): Retrying command
(da2:iscsi2:0:0:0): Retrying command
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 96 4e 00 00 20 00 
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 95 8e 00 00 20 00 
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:iscsi1:0:0:0): Retrying command
(da2:iscsi2:0:0:0): Retrying command
(da3:iscsi1:0:0:0): READ(10). CDB: 28 00 00 b9 96 6e 00 00 20 00 
(da2:iscsi2:0:0:0): READ(10). CDB: 28 00 00 b9 a9 b2 00 00 01 00 
(da3:iscsi1:0:0:0): CAM status: CCB request aborted by the host
(da2:iscsi2:0:0:0): CAM status: CCB request aborted by the host
(da3:iscsi1:0:0:0): Retrying command
(da2:iscsi2:0:0:0): Retrying command
da3 at iscsi1 bus 0 scbus4 target 0 lun 0
da3: <FREEBSD CTLDISK 0001> s/n MYSERIAL   0 detached
da2 at iscsi2 bus 0 scbus3 target 0 lun 0
da2: <FREEBSD CTLDISK 0001> s/n MYSERIAL   1 detached
(da3:iscsi1:0:0:0): Periph destroyed
(da2:iscsi2:0:0:0): Periph destroyed
da2 at iscsi1 bus 0 scbus4 target 0 lun 0
da2: <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-4 SCSI device
da2: Serial Number MYSERIAL   0
da2: 150.000MB/s transfers
da2: Command Queueing enabled
da2: 1840144MB (471076881 4096 byte sectors)
da3 at iscsi2 bus 0 scbus3 target 0 lun 0
da3: <FREEBSD CTLDISK 0001> Fixed Direct Access SPC-4 SCSI device
da3: Serial Number MYSERIAL   1
da3: 150.000MB/s transfers
da3: Command Queueing enabled
da3: 1840144MB (471076881 4096 byte sectors)
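
To see at a glance which target reconnects and why, the WARNING lines in excerpts like the ones above can be tallied. The following is only a rough sketch that assumes the message format shown in this comment ("WARNING: <address> (<target>): <reason>; reconnecting"); the helper name `count_reconnects` is hypothetical.

```python
import re
from collections import Counter

# Matches the start of one chunk that follows "WARNING: ".
RECONNECT_RE = re.compile(r"(\S+) \(([^)]+)\): ([^;]+); reconnecting")


def count_reconnects(log_text: str) -> Counter:
    """Count (target, reason) pairs for warnings that end in '; reconnecting'."""
    # Mail wrapping splits warnings across lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    counts = Counter()
    for chunk in flat.split("WARNING: ")[1:]:
        m = RECONNECT_RE.match(chunk)
        if m:  # warnings without '; reconnecting' (e.g. handoff) are skipped
            counts[(m.group(2), m.group(3))] += 1
    return counts


if __name__ == "__main__":
    import sys
    for (target, reason), n in count_reconnects(sys.stdin.read()).items():
        print(f"{n:4d}  {target}  {reason}")
```

Saved under a name of your choice, it could be fed a saved copy of the console output on standard input.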

I've raised these timeouts a little:
kern.iscsi.login_timeout: 30
kern.iscsi.iscsid_timeout: 30
kern.iscsi.ping_timeout: 30

and will see whether it makes any difference.
