ctld(8) 11.2-release lockup with w2k16 [Was: Re: ctld(8), multiple 'portal-group' on same socket (individual 'discovery-auth-group' restrictions)]

Harry Schmalzbauer freebsd at omnilan.de
Thu Jul 5 16:17:21 UTC 2018


On 21.10.2014 at 12:43, Edward Tomasz Napierała wrote:
> On 1020T1035, Harald Schmalzbauer wrote:
>>   Hello,
>>
>> I'm trying to move from istgt(1) to ctld(8), but it seems my setup isn't
>> possible with ctld.
>> Besides missing support for virtual-DVDs ('UnitType DVD' in istgt) and
>> real ODD-devices ('UnitType pass' in istgt),
> Yup, we don't implement virtual DVDs and passthrough.  Especially the
> latter would be a nice feature to have.


Hello Edward,

My current problem is unrelated to that thread, but this old mail shows
how long I've been happily using ctld(8) without problems :-) Thanks!

Recently, I discovered that WindowsServerBackup fails with Win2k16 as
the initiator (I never used 2k12).
Older initiators running 2008R2 (or ESXi 5.5) can still use ctld(8)
ZVOL targets for WindowsServerBackup on 11.2-release without problems.

I haven't had time to do much analysis, and I lack the skills/equipment
to dig into this at debugger level, but I wanted to ask whether you're
aware of problems with Windows Server 2016 as an initiator for ctld(8)
targets.

The Symptoms:

The system locks up for about 30-60 seconds under iSCSI load from w2k16.
While the lockup lasts, systat(1) shows 25% intr usage (which is one
core), and not even the login session is responsive: userland output
isn't updated and input isn't acted upon.
The input is queued, though, and gets processed once the lockup releases.
The lockup ends as soon as the iSCSI session is reset:
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 (iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): no ping reply (NOP-Out) after 5 seconds; dropping connection
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 (iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): waiting for CTL to terminate 94 tasks
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 (iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): tasks terminated
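
To correlate lockups with session resets over a longer period, something
like the following could pull the relevant events out of the system log.
This is only a rough sketch: the function name and the regular
expressions are my own assumptions, derived solely from the three kernel
messages quoted above (with the mail client's line wrapping undone), and
other releases may word these messages differently.

```python
import re

# Sample input: the three kernel messages from this report, one per line.
SAMPLE_LOG = """\
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 (iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): no ping reply (NOP-Out) after 5 seconds; dropping connection
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 (iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): waiting for CTL to terminate 94 tasks
Jun 28 06:14:09 bansta kernel: WARNING: 172.24.32.172 (iqn.1991-05.com.microsoft:dafus.mgn.mo1.psw-online.de): tasks terminated
"""

# Assumed message layout, taken only from the lines above.
DROP_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+WARNING:\s+"
    r"(?P<addr>\S+)\s+\((?P<iqn>[^)]+)\):\s+"
    r"no ping reply \(NOP-Out\) after (?P<secs>\d+) seconds; dropping connection"
)
TASKS_RE = re.compile(r"waiting for CTL to terminate (?P<n>\d+) tasks")


def find_session_drops(text):
    """Return one dict per 'no ping reply' drop found in the log text.

    If a 'waiting for CTL to terminate N tasks' line follows a drop,
    N is attached to the most recent drop as 'pending_tasks'.
    """
    events = []
    for line in text.splitlines():
        drop = DROP_RE.search(line)
        if drop:
            events.append({
                "time": drop.group("ts"),
                "initiator": drop.group("addr"),
                "iqn": drop.group("iqn"),
                "timeout_s": int(drop.group("secs")),
                "pending_tasks": None,
            })
            continue
        tasks = TASKS_RE.search(line)
        if tasks and events:
            events[-1]["pending_tasks"] = int(tasks.group("n"))
    return events


if __name__ == "__main__":
    for event in find_session_drops(SAMPLE_LOG):
        print(event)
```

Feeding it the contents of the real log file instead of SAMPLE_LOG would
give a list of drop times to compare against the observed lockups.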

Sometimes it's possible to transfer 30GB before the lockup happens;
sometimes even an NTFS quick format triggers it.


Yesterday I used istgt(1) instead of ctld(8) to export exactly the same
ZVOL over exactly the same network backend to exactly the same
initiator.
The lockup didn't occur anymore, and the complete WindowsServerBackup
task finished successfully on the Windows Server 2016 initiator.  So I
strongly suspect a ctld(8) locking problem.
As mentioned, the target backend is a ZFS volume.  I have also used an
HDD as the target backend (and observed much better performance, which
drops even if I use a UFS vnode backend on the same HDD), but I'm no
longer sure whether the lockup occurred with it as well...

For now I can't tell you anything more helpful; I can only describe the
symptoms and ask whether you have any hints on what to try next to
narrow down the problem, or whether this is an already known problem.

Thanks,

-harry


More information about the freebsd-stable mailing list