Fwd: Re: Re: zfs deadlock
krichy at cflinux.hu
Mon Dec 16 00:27:10 UTC 2013
Dear devs,
I've managed to fix my issue; please review the attached patch.
First, the traverse() call now conforms to the lock order described in
kern/vfs_subr.c before vfs_busy(). Also, traverse() now returns a locked
vnode on success, even when no filesystem is mounted over the given
vnode.
Finally, a deadlock race between zfsctl_snapdir_lookup() and
zfsctl_snapshot_inactive() is handled. This part probably needs the most
review, as it may be buggy or introduce new bugs. The patch applies to
stable/10 right now.
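For readers unfamiliar with this class of race, here is a minimal userspace
sketch of one common way to avoid a lock-order deadlock: the path that already
holds the "second" lock only tries the "first" one, and backs off if it is
busy. This is an illustration only. The attached patch is not visible in this
archive, so nothing here should be read as what the patch does, and the names
lock_a, lock_b, forward, reverse and run_demo are hypothetical.

```python
import threading
import time


def run_demo(iterations=1000):
    """Run two threads that need the same two locks in opposite order.

    Returns (count, backoffs): count is the number of completed critical
    sections, backoffs is how often the reverse path had to retry.
    """
    lock_a = threading.Lock()  # hypothetical "first" lock in the agreed order
    lock_b = threading.Lock()  # hypothetical "second" lock in the agreed order
    state = {"count": 0, "backoffs": 0}

    def forward():
        # Follows the agreed order (a, then b), so blocking is safe.
        for _ in range(iterations):
            with lock_a:
                with lock_b:
                    state["count"] += 1

    def reverse():
        # Starts out holding b, so it must never block on a.
        for _ in range(iterations):
            while True:
                lock_b.acquire()
                if lock_a.acquire(blocking=False):
                    break  # got both locks without waiting
                # a is busy: drop b so the forward path can make progress,
                # then retry from the start.
                lock_b.release()
                state["backoffs"] += 1
                time.sleep(0)  # yield to the other thread
            state["count"] += 1
            lock_a.release()
            lock_b.release()

    threads = [threading.Thread(target=forward),
               threading.Thread(target=reverse)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"], state["backoffs"]
```

If reverse() blocked on lock_a instead of trying it, the two threads could
each hold one lock and wait forever for the other, which is the general shape
of the hang described in this thread.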
I am waiting on feedback.
Regards,
On 2013-12-11 at 11:43, krichy at cflinux.hu wrote:
> Dear devs,
>
> I have still had no success fixing these bugs; please help if you can. I
> currently don't understand the recursive lock problem or how it should be
> avoided.
>
> Thanks in advance,
>
> On 2013-12-07 at 15:42, krichy at cflinux.hu wrote:
>> Dear Xin,
>>
>> I don't know whether you read the -fs list, but there is a possible
>> bug in zfs snapshot handling. Unfortunately I cannot fix the
>> problem, but at least I was able to reproduce it.
>> Please have a look at it, and if I can help resolve it, I will.
>>
>> Regards,
>>
>> -------- Original message --------
>> Subject: Re: Re: zfs deadlock
>> Date: 2013-12-07 14:38
>> From: krichy at cflinux.hu
>> To: Steven Hartland <killing at multiplay.co.uk>
>> Cc: freebsd-fs at freebsd.org
>>
>> Dear Steven,
>>
>> A crash is very easily reproducible with the attached script: just
>> make an empty dataset, make a snapshot of it,
>> and run the script.
>> In my virtual machine it crashed within a few seconds, producing the
>> attached output.
>>
>> Regards,
>> On 2013-12-06 at 17:28, krichy at cflinux.hu wrote:
>>> Dear Steven,
>>>
>>> Using the previously provided scripts, the bug still appears, and I
>>> captured the attached traces when the deadlock occurred.
>>>
>>> It seems that one process is in zfs_mount(), while the other is in
>>> zfs_unmount_snap(). Look for the 'zfs' and 'ls' commands.
>>>
>>> Hope it helps.
>>>
>>> Regards,
>>> On 2013-12-06 at 16:59, krichy at cflinux.hu wrote:
>>>> So maybe the force flag is too strict. Under Linux the snapshots
>>>> remain mounted after a send.
>>>>
>>>> On 2013-12-06 at 16:54, krichy at cflinux.hu wrote:
>>>>> Dear Steven,
>>>>>
>>>>> Of course. But I have got further now. You mentioned that it is
>>>>> normal that zfs send unmounts snapshots. I don't know about that,
>>>>> but it indeed causes a problem:
>>>>>
>>>>> It is also reproducible without zfs send.
>>>>> 1. Have a large directory structure (just to make sure find runs
>>>>> long enough) and make a snapshot of it.
>>>>> 2. Run find inside the snapshot:
>>>>> # cd /mnt/pool/set/.zfs/snapshot/snap
>>>>> # find .
>>>>>
>>>>> 3. Meanwhile, on another console, force-unmount the snapshot:
>>>>> # umount -f /mnt/pool/set/.zfs/snapshot/snap
>>>>>
>>>>> This will cause a panic or something similar.
>>>>>
>>>>> So effectively a regular user on a system can cause a crash.
>>>>>
>>>>> Regards,
>>>>>
>>>>> On 2013-12-06 at 16:50, Steven Hartland wrote:
>>>>>> kernel compiled, installed and rebooted?
>>>>>> ----- Original Message ----- From: <krichy at cflinux.hu>
>>>>>> To: <smh at FreeBSD.org>
>>>>>> Sent: Friday, December 06, 2013 12:17 PM
>>>>>> Subject: Fwd: Re: zfs deadlock
>>>>>>
>>>>>>
>>>>>>> Dear smh,
>>>>>>>
>>>>>>> I've applied r258294 on top of releng/9.2, but my test seems to
>>>>>>> trigger the deadlock again.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> -------- Original message --------
>>>>>>> Subject: Re: zfs deadlock
>>>>>>> Date: 2013-12-06 13:17
>>>>>>> From: krichy at cflinux.hu
>>>>>>> To: freebsd-fs at freebsd.org
>>>>>>>
>>>>>>> I've applied r258294 on top of releng/9.2, and when running the
>>>>>>> attached scripts in parallel, the system got into a deadlock again.
>>>>>>>
>>>>>>> On 2013-12-06 at 11:35, Steven Hartland wrote:
>>>>>>>> That's correct, it unmounts the mounted snapshot.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Steve
>>>>>>>>
>>>>>>>> ----- Original Message ----- From: <krichy at cflinux.hu>
>>>>>>>> To: "Steven Hartland" <killing at multiplay.co.uk>
>>>>>>>> Cc: <freebsd-fs at freebsd.org>
>>>>>>>> Sent: Friday, December 06, 2013 8:50 AM
>>>>>>>> Subject: Re: zfs deadlock
>>>>>>>>
>>>>>>>>
>>>>>>>>> What is also strange: when a zfs send finishes, the find
>>>>>>>>> command running in parallel reports errors:
>>>>>>>>>
>>>>>>>>> find: ./e/Chuje: No such file or directory
>>>>>>>>> find: ./e/singe: No such file or directory
>>>>>>>>> find: ./e/joree: No such file or directory
>>>>>>>>> find: ./e/fore: No such file or directory
>>>>>>>>> find: fts_read: No such file or directory
>>>>>>>>> Fri Dec 6 09:46:04 CET 2013 2
>>>>>>>>>
>>>>>>>>> It seems as if the filesystem got unmounted in the meantime,
>>>>>>>>> even though the script had changed its working directory to
>>>>>>>>> the snapshot dir.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> On 2013-12-06 at 09:03, krichy at cflinux.hu wrote:
>>>>>>>>>> Dear Steven,
>>>>>>>>>>
>>>>>>>>>> While I was playing with zfs, trying to reproduce the previous
>>>>>>>>>> bug, I accidentally hit another one, which produced the trace
>>>>>>>>>> I attached.
>>>>>>>>>>
>>>>>>>>>> The snapshot contains two levels of directories, which contain
>>>>>>>>>> files. It was meant to simulate a vmail setup with a
>>>>>>>>>> domain/user hierarchy.
>>>>>>>>>>
>>>>>>>>>> I hope it is useful for someone.
>>>>>>>>>>
>>>>>>>>>> I used the attached two scripts to reproduce the ZFS bug.
>>>>>>>>>>
>>>>>>>>>> It definitely crashes the system; this is the third time in
>>>>>>>>>> the last 10 minutes.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> On 2013-12-05 at 20:26, krichy at cflinux.hu wrote:
>>>>>>>>>>> Dear Steven,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for your reply. Do you know how to reproduce the bug?
>>>>>>>>>>> Simply sending a snapshot which is mounted does not
>>>>>>>>>>> automatically trigger the deadlock. Are some special
>>>>>>>>>>> conditions needed? How can I verify that the patch fixes this?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> On 2013-12-05 at 19:39, Steven Hartland wrote:
>>>>>>>>>>>> Known issue, you want:
>>>>>>>>>>>> http://svnweb.freebsd.org/changeset/base/258595
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Steve
>>>>>>>>>>>>
>>>>>>>>>>>> ----- Original Message ----- From: "Richard Kojedzinszky"
>>>>>>>>>>>> <krichy at cflinux.hu>
>>>>>>>>>>>> To: <freebsd-fs at freebsd.org>
>>>>>>>>>>>> Sent: Thursday, December 05, 2013 2:56 PM
>>>>>>>>>>>> Subject: zfs deadlock
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear fs devs,
>>>>>>>>>>>>>
>>>>>>>>>>>>> We have a FreeNAS server, which is basically FreeBSD. I was
>>>>>>>>>>>>> trying to look at snapshots using ls .zfs/snapshot/.
>>>>>>>>>>>>>
>>>>>>>>>>>>> When I issued it, the system entered a deadlock. An NFSD and
>>>>>>>>>>>>> a zfs send were running when I issued the command.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I attached command outputs captured while the system was in
>>>>>>>>>>>>> a deadlock state. I tried to issue
>>>>>>>>>>>>> # reboot -q
>>>>>>>>>>>>> but that did not restart the system. After a while (5-10
>>>>>>>>>>>>> minutes) the system rebooted; I don't know whether the
>>>>>>>>>>>>> deadman caused that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now the system is up and running.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is basically a FreeBSD 9.2 kernel.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does anyone have a clue?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kojedzinszky Richard
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --------------------------------------------------------------------------------
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ================================================
>>>>>>>>>>>> This e.mail is private and confidential between Multiplay
>>>>>>>>>>>> (UK) Ltd.
>>>>>>>>>>>> and the person or entity to whom it is addressed. In the
>>>>>>>>>>>> event of
>>>>>>>>>>>> misdirection, the recipient is prohibited from using,
>>>>>>>>>>>> copying,
>>>>>>>>>>>> printing or otherwise disseminating it or any information
>>>>>>>>>>>> contained
>>>>>>>>>>>> in
>>>>>>>>>>>> it.
>>>>>>>>>>>>
>>>>>>>>>>>> In the event of misdirection, illegible or incomplete
>>>>>>>>>>>> transmission
>>>>>>>>>>>> please telephone +44 845 868 1337
>>>>>>>>>>>> or return the E.mail to postmaster at multiplay.co.uk.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zfs-deadlock-1.patch
Type: text/x-diff
Size: 3563 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20131216/45290b16/attachment.patch>