Strange FreeBSD lock (related to iSCSI initiator connected to NetBSD istgt ?)

From: BERTRAND_Joël <joel.bertrand_at_systella.fr>
Date: Mon, 15 Jul 2024 14:13:44 UTC
	Hello,

	On my network, all workstations are diskless (Linux, FreeBSD, NetBSD,
even OpenVMS). The main server runs NetBSD 10.0 and acts as:
- boot server
- NFS server (/ and /home)
- iSCSI target (for swap)

	Pythagore runs FreeBSD 14.0 and is a diskless workstation.
	Legendre runs NetBSD 10.0 and is the network's main server.

	The FreeBSD workstation randomly hangs. On the server side, I can see in
this workstation's log file:

Jul 15 15:54:16 pythagore kernel: WARNING: 192.168.10.128
(iqn.2020-02.fr.systella.legendre.istgt:pythagore): no ping reply
(NOP-In) after 5 seconds; reconnecting
Jul 15 15:54:31 pythagore kernel: swap_pager: indefinite wait buffer:
bufobj: 0, blkno: 3242023, size: 32768
Jul 15 15:54:32 pythagore kernel: swap_pager: indefinite wait buffer:
bufobj: 0, blkno: 3295147, size: 8192
Jul 15 15:54:32 pythagore kernel: swap_pager: indefinite wait buffer:
bufobj: 0, blkno: 3294054, size: 24576
Jul 15 15:54:32 pythagore kernel: swap_pager: indefinite wait buffer:
bufobj: 0, blkno: 2691989, size: 4096
Jul 15 15:54:33 pythagore kernel: swap_pager: indefinite wait buffer:
bufobj: 0, blkno: 3185488, size: 4096
Jul 15 15:54:33 pythagore kernel: swap_pager: indefinite wait buffer:
bufobj: 0, blkno: 3244176, size: 4096
Jul 15 15:54:33 pythagore kernel: swap_pager: indefinite wait buffer:
bufobj: 0, blkno: 1477832, size: 12288
...
Jul 15 15:55:17 pythagore kernel: WARNING: 192.168.10.128
(iqn.2020-02.fr.systella.legendre.istgt:pythagore): login timed out
after 61 seconds; reconnecting
...

	OK, I understand this part: the FreeBSD iSCSI initiator was disconnected
and, as FreeBSD tries to do some operation on swap, the kernel hangs.
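
	To check the order of events, a small sketch like the following can
measure the delay between the lost ping and the first swap stall. It is only
an illustration: the names LOG, timestamp and summarize are hypothetical, the
sample lines are copied from the excerpt above with the mail wrapping undone,
and the year is an assumption because syslog does not record it.

```python
import re
from datetime import datetime

# Sample lines copied from the log above, rejoined after mail wrapping.
LOG = [
    "Jul 15 15:54:16 pythagore kernel: WARNING: 192.168.10.128 "
    "(iqn.2020-02.fr.systella.legendre.istgt:pythagore): no ping reply "
    "(NOP-In) after 5 seconds; reconnecting",
    "Jul 15 15:54:31 pythagore kernel: swap_pager: indefinite wait buffer: "
    "bufobj: 0, blkno: 3242023, size: 32768",
    "Jul 15 15:54:32 pythagore kernel: swap_pager: indefinite wait buffer: "
    "bufobj: 0, blkno: 3295147, size: 8192",
    "Jul 15 15:55:17 pythagore kernel: WARNING: 192.168.10.128 "
    "(iqn.2020-02.fr.systella.legendre.istgt:pythagore): login timed out "
    "after 61 seconds; reconnecting",
]

# Leading syslog timestamp, e.g. "Jul 15 15:54:16 ".
STAMP = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) ")


def timestamp(line, year=2024):
    """Parse the syslog timestamp; the year is assumed, syslog omits it."""
    match = STAMP.match(line)
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")


def summarize(lines):
    """Return (seconds from first ping loss to first swap stall, stall count)."""
    lost = [timestamp(l) for l in lines if "no ping reply" in l]
    stalls = [timestamp(l) for l in lines
              if "swap_pager: indefinite wait buffer" in l]
    if not lost or not stalls:
        return None, len(stalls)
    return (stalls[0] - lost[0]).total_seconds(), len(stalls)


delay, count = summarize(LOG)
print(f"first swap stall {delay:.0f} s after ping loss, {count} stalls")
```

	In the excerpt above, the first swap_pager message is logged 15 seconds
after the "no ping reply" warning, so the swap stalls follow the loss of the
iSCSI session rather than precede it.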

	On the server side, I have:

Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:
765:istgt_iscsi_read_pdu: ***ERROR*** readv() failed
(-1,errno=54,iqn.1994-09.org.freebsd:pythagore,time=0)
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:5685:worker:
***ERROR*** conn->state = 1
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:5702:worker:
***ERROR*** iscsi_read_pdu() failed
Jul 15 15:54:17 legendre istgt[26125]:
istgt_iscsi.c:1260:istgt_iscsi_write_pdu_internal: ***ERROR*** writev()
failed (errno=32,iqn.1994-09.org.freebsd:pythagore,time=0)
Jul 15 15:54:17 legendre istgt[26125]:
istgt_iscsi.c:3484:istgt_iscsi_transfer_in_internal: ***ERROR***
iscsi_write_pdu() failed
Jul 15 15:54:17 legendre istgt[26125]:
istgt_iscsi.c:3853:istgt_iscsi_task_response: ***ERROR***
iscsi_transfer_in() failed
Jul 15 15:54:17 legendre istgt[26125]: istgt_iscsi.c:5389:sender:
***ERROR*** iscsi_task_response() CmdSN=552123 failed on
iqn.2020-02.fr.systella.legendre.istgt:pythagore,t,0x0001(iqn.1994-09.org.freebsd:pythagore,i,0x80f5c96f41e6)
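
	For reference, the two errno values in these messages can be decoded
with a short sketch like this one. It assumes the BSD numbering from
<sys/errno.h> (54 is ECONNRESET, 32 is EPIPE); these numbers differ on Linux,
so the table is hardcoded instead of using Python's errno module. The names
BSD_ERRNO and decode_errnos are hypothetical.

```python
import re

# Assumption: errno numbers as defined in the BSD <sys/errno.h>.
# They differ on Linux, so Python's own errno module is not used here.
BSD_ERRNO = {
    32: ("EPIPE", "Broken pipe"),
    54: ("ECONNRESET", "Connection reset by peer"),
}

ERRNO_RE = re.compile(r"errno=(\d+)")


def decode_errnos(text):
    """Return the symbolic name for every errno=N found in text, in order."""
    names = []
    for num in ERRNO_RE.findall(text):
        name, _desc = BSD_ERRNO.get(int(num), (f"unknown({num})", ""))
        names.append(name)
    return names


# Fragments copied from the istgt messages above.
sample = (
    "istgt_iscsi_read_pdu: ***ERROR*** readv() failed "
    "(-1,errno=54,iqn.1994-09.org.freebsd:pythagore,time=0) "
    "istgt_iscsi_write_pdu_internal: ***ERROR*** writev() failed "
    "(errno=32,iqn.1994-09.org.freebsd:pythagore,time=0)"
)
print(decode_errnos(sample))
```

	If that numbering holds, istgt saw the connection reset by the peer at
15:54:17, one second after the initiator logged that it was reconnecting,
and the following writev() then failed on the already closed socket.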

	All other workstations (mainly Linux and NetBSD) run without trouble.
Only FreeBSD triggers this issue. I suppose this bug is specific to
FreeBSD, but I'm not sure.

	Any help fixing it would be welcome.

	Best regards,

	JB