panic: _sx_xlock_hard: recursed on non-recursive sx zfsvfs->z_hold_mtx[i] @ ...cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1407

Fri Oct 5 10:50:54 UTC 2012

on 03/10/2012 11:42 Andriy Gapon said the following:
> on 30/09/2012 15:24 Konstantin Belousov said the following:
>> The postponing of the reclaim when vnode reserve goes low to the vnlru 
>> process does not solve anything, since you only change the recursion into 
>> the deadlock.
>>
>> I discussed an approach for this issue with avg. Basic idea is presented in
>> the untested patch below. You can specify that some count of the free
>> vnodes must be present for some dynamic scope, started by 
>> getnewvnode_reserve() function. While staying inside the reserved pool,
>> getnewvnode() calls would not recurse into vnlru(). The scope is finished
>> with getnewvnode_drop_reserve().
>>
>> The getnewvnode_reserve() shall be called while no locks are held.
>>
>> What do you think ?
> 
> Here is a patch that makes use of the getnewvnode_reserve API in ZFS:
> http://people.freebsd.org/~avg/zfs-getnewvnode.diff
> 

BTW, my impression is that this problem is a recurrence of the original
zfs_reclaim problem, but now in zfs_inactive (with some help from nullfs).
So, with the approach that Kostik designed and which fixes zfs_inactive,
zfs_reclaim should now be safe and should no longer require the taskqueue hack.
There should never be a recursion back into ZFS via getnewvnode.
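
To illustrate the scoping idea from the quoted description, here is a toy
user-space model in C. It is only a sketch under stated assumptions: every
toy_* name and the pool structure are hypothetical, and nothing here is the
kernel code or the patch. It shows only that an allocation made inside a
reserved scope is served from the reserve and does not reach the reclaim
path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a vnode pool; not the real kernel structures. */
struct toy_vnode_pool {
	unsigned free_vnodes;	/* vnodes available without reclaiming */
	unsigned reserved;	/* vnodes set aside for the current scope */
	unsigned recursions;	/* allocations that had to reclaim under locks */
};

/*
 * Models the role of getnewvnode_reserve(): called while no locks are
 * held, so any reclaim needed to fill the reserve is safe at this point.
 */
static void
toy_reserve(struct toy_vnode_pool *p, unsigned count)
{
	assert(p != NULL);
	while (p->free_vnodes < count)
		p->free_vnodes++;	/* stands in for a safe, up-front reclaim */
	p->free_vnodes -= count;
	p->reserved += count;
}

/*
 * Models the role of getnewvnode(): returns true when the request was
 * served from the reserve. With no reserve and no free vnodes it counts
 * a recursion, standing in for re-entering the filesystem under locks.
 */
static bool
toy_getnewvnode(struct toy_vnode_pool *p)
{
	assert(p != NULL);
	if (p->reserved > 0) {
		p->reserved--;
		return (true);
	}
	if (p->free_vnodes == 0)
		p->recursions++;
	else
		p->free_vnodes--;
	return (false);
}

/* Models the role of getnewvnode_drop_reserve(): ends the scope. */
static void
toy_drop_reserve(struct toy_vnode_pool *p)
{
	assert(p != NULL);
	p->free_vnodes += p->reserved;
	p->reserved = 0;
}

/* One allocation from an empty pool, with or without a reserved scope. */
static unsigned
toy_scenario(bool use_reserve)
{
	struct toy_vnode_pool p = { 0, 0, 0 };

	if (use_reserve)
		toy_reserve(&p, 1);	/* no locks held yet */
	/* ... filesystem locks would be taken here ... */
	(void)toy_getnewvnode(&p);
	/* ... and released here ... */
	if (use_reserve)
		toy_drop_reserve(&p);
	return (p.recursions);
}
```

In this model toy_scenario(true) never touches the recursion counter, while
toy_scenario(false) does, which is the property the reserved scope is meant
to provide.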

-- 
Andriy Gapon