[patch] zfs livelock and thread priorities
Ben Kelly
ben at wanderview.com
Tue Apr 28 21:19:38 UTC 2009
On Apr 28, 2009, at 4:52 PM, Ben Kelly wrote:
> On Apr 28, 2009, at 2:11 PM, Artem Belevich wrote:
>> My system had eventually deadlocked overnight, though it took much
>> longer than before to reach that point.
>>
>> In the end I've got many many processes sleeping in zio_wait with no
>> disk activity whatsoever.
>> I'm not sure if that's the same issue or not.
>>
>> Here are stack traces for all processes -- http://pastebin.com/f364e1452
>> I've got the core saved, so if you want me to dig out some more info,
>> let me know if/how I could help.
>
> It looks like there is a possible deadlock between zfs_zget() and
> zfs_zinactive(). They both acquire a lock via
> ZFS_OBJ_HOLD_ENTER(). The zfs_zinactive() path can get called
> indirectly from within zio_done(). The zfs_zget() can in turn block
> waiting for zio_done()'s completion while holding the object lock.
>
> The following patch might help:
>
> http://www.wanderview.com/svn/public/misc/zfs/zfs_zinactive_deadlock.diff
>
> This simply bails out of the inactive processing if the object lock
> is already held. I'm not sure if this is 100% correct or not as it
> cannot verify there are references to the vnode. I also tried
> executing the zfs_zinactive() logic in a taskqueue to avoid the
> deadlock, but that caused other deadlocks to occur.
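For readers following along, the bail-out idea can be sketched in userspace C. This is only an analogy, not code from the patch: obj_lock is a hypothetical stand-in for the lock taken by ZFS_OBJ_HOLD_ENTER(), and inactive_try() is an invented name for the inactive path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-object lock (not the real ZFS lock). */
static pthread_mutex_t obj_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical inactive path: try to take the object lock and give up
 * immediately if someone else holds it, instead of blocking.
 * Returns true if the cleanup ran, false if it bailed out.
 */
static bool
inactive_try(int *cleanups)
{
	int rc;

	if (pthread_mutex_trylock(&obj_lock) != 0)
		return false;		/* lock busy: skip rather than wait */

	(*cleanups)++;			/* the cleanup work would go here */

	rc = pthread_mutex_unlock(&obj_lock);
	assert(rc == 0);
	return true;
}
```

The drawback shows up in the false case: the cleanup is simply skipped, and nothing guarantees it is retried later.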
Sorry to reply to my own mail, but I came up with a better solution
that I think is correct. I just vref() the vnode and then vrele() it
again from a taskqueue to restart the zfs_zinactive() processing if
it's still applicable.
The patch is updated in the same location above.
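As a rough illustration of the hold-and-defer approach, here is a single-threaded userspace sketch. It is not the patch itself. All names (struct node, node_release(), run_deferred()) are made up for the example: refs plays the role of the vnode use count, taking an extra reference stands in for vref(), and draining the list stands in for the taskqueue calling vrele().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical object; not the real vnode or znode layout. */
struct node {
	pthread_mutex_t	 lock;		/* stands in for the object lock */
	int		 refs;		/* stands in for the use count */
	int		 inactive_runs;	/* times cleanup actually completed */
	struct node	*next_deferred;	/* link on the deferred list */
};

/* Hypothetical deferred-work list, standing in for a taskqueue. */
static struct node *deferred_head;

/* Inactive processing: if the lock is busy, hold the node and defer. */
static void
node_inactive(struct node *n)
{
	if (pthread_mutex_trylock(&n->lock) != 0) {
		n->refs++;			/* analogous to vref() */
		n->next_deferred = deferred_head;
		deferred_head = n;
		return;
	}
	n->inactive_runs++;			/* cleanup would go here */
	pthread_mutex_unlock(&n->lock);
}

/* Drop one reference; the last drop triggers inactive processing. */
static void
node_release(struct node *n)
{
	assert(n->refs > 0);
	if (--n->refs == 0)
		node_inactive(n);
}

/*
 * Worker side: detach the whole list first so a node that is still
 * busy gets re-queued for a later pass instead of looping forever.
 * Returns the number of deferred releases performed.
 */
static int
run_deferred(void)
{
	struct node *list = deferred_head;
	int ran = 0;

	deferred_head = NULL;
	while (list != NULL) {
		struct node *n = list;

		list = n->next_deferred;
		n->next_deferred = NULL;
		node_release(n);		/* analogous to vrele() */
		ran++;
	}
	return ran;
}
```

The extra reference keeps the object alive until the worker runs, and dropping it goes back through the normal release path, so the inactive processing only restarts if the count really does reach zero again.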
Thanks again.
- Ben