lock violation in unionfs (9.0-STABLE r230270)
Attilio Rao
attilio at freebsd.org
Fri Nov 2 14:21:20 UTC 2012
On Wed, Oct 31, 2012 at 11:11 AM, Harald Schmalzbauer
<h.schmalzbauer at omnilan.de> wrote:
> Attilio Rao wrote on 29.10.2012 23:02 (localtime):
>> On Mon, Oct 29, 2012 at 7:37 PM, Harald Schmalzbauer
>> <h.schmalzbauer at omnilan.de> wrote:
>>> Attilio Rao wrote on 27.10.2012 23:07 (localtime):
>>>> On Sat, Oct 27, 2012 at 9:46 PM, Attilio Rao <attilio at freebsd.org> wrote:
>>>>> On Sat, Sep 8, 2012 at 12:48 AM, Attilio Rao <attilio at freebsd.org> wrote:
>>>>>> On Thu, Sep 6, 2012 at 4:52 PM, Harald Schmalzbauer
>>>>>> <h.schmalzbauer at omnilan.de> wrote:
>>>>>>> Attilio Rao wrote on 09.08.2012 20:26 (localtime):
>>>>>>>> On 8/8/12, Harald Schmalzbauer <h.schmalzbauer at omnilan.de> wrote:
>>>>>>>>> Pavel Polyakov wrote on 06.03.2012 11:20 (localtime):
>>>>>>>>>>>> mount -t unionfs -o noatime /usr /mnt
>>>>>>>>>>>>
>>>>>>>>>>>> insmntque: mp-safe fs and non-locked vp: 0xfffffe01d96704f0 is not
>>>>>>>>>>>> exclusive locked but should be
>>>>>>>>>>>> KDB: enter: lock violation
>>>>>>>>>>> Pavel,
>>>>>>>>>>> can you give a spin to this patch?:
>>>>>>>>>>> http://www.freebsd.org/~attilio/unionfs_missing_insmntque_lock.patch
>>>>>>>>>>>
>>>>>>>>>>> I think that the unlocking is due at that point as the vnode lock can
>>>>>>>>>>> be switched later on.
>>>>>>>>>>>
>>>>>>>>>>> Let me know what you think about it and what the test does.
>>>>>>>>>> Thanks!
>>>>>>>>>> This patch fixes the problem with lock violation. Sorry I've tested it so
>>>>>>>>>> late.
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> this patch still applies cleanly to RELENG_9_1. Was there another fix
>>>>>>>>> for the issue or has it just not been PR-sent and thus forgotten?
>>>>>>>> Can you and Pavel try the attached patch? Unfortunately I had no time
>>>>>>>> to test it, I just made it in 5 free minutes from a non-FreeBSD workstation,
>>>>>>> Sorry, couldn't test earlier, but now I did:
>>>>>>> With this patch applied the machine hangs without debug kernel and the
>>>>>>> latter gives the following panic:
>>>>>>> System call nmount returning with the following locks held:
>>>>>>> exclusive lockmgr ufs (ufs) r = 0 (0xc5438278) locked @
>>>>>>> src/sys/fs/unionfs/union_vnops.c:1938
>>>>>>> panic: witness_warn
>>>>>>> cpuid = 0
>>>>>>> KDB: stack backtrace:
>>>>>>> db_trace_self_wrapper(c0a04f7f,c0c112c4,d1de3bb4,c097aa8c,fc,...) at
>>>>>>> db_trace_self_wrapper+0x26
>>>>>>> kdb_backtrace(c0a4965f,0,c09c2ede3c1c,0,...) at kdb_backtrace+0x2a
>>>>>>> witness_warn(2,0,c0a4ac34,c0a0990a,286,...) at witness_warn+0x1e4
>>>>>>> syscall(d1de3d08) at syscall+0x415
>>>>>>> Xint0x80_syscall() at Xint0x80_syscall+0x21
>>>>>>> --- syscall (0, FreeBSD ELF32, nosys), eip = 0x280b883f,esp =
>>>>>>> 0xbfbfe46c, ebp = 0xbfbfede8 ---
>>>>>>> KDB: enter: panic
>>>>>>> [ thread pid 86 tid 100054 ]
>>>>>>> Stopped at kdb_enter+0x3a: movl $0,kdb_why
>>>>>>> db> bt
>>>>>>> Tracing pid 86 tid 100054 td 0xc541b000
>>>>>>> kdb_enter(c0a00d16,c0a09130,0,0,0,...) at panic+0x190
>>>>>>> witness_warn(2,0,c0a4ac34,c0a0990a,286,...) at witness_warn+0x1e4
>>>>>>> syscall(d1de3d08) at syscall+0x415
>>>>>>> Xint0x80_syscall() at Xint0x80_syscall+0x21
>>>>>>>
>>>>>>> Hmm, I guess I forgot to install kernel debug symbols...
>>>>>>> Coming back if I have more
>>>>>> Unfortunately unionfs does very wrong things with the insmntque() locking.
>>>>>> It basically expects the vnode to return locked in the same way
>>>>>> requested by the preceding namei() (when that happens), but when you do
>>>>>> insmntque() you can only have an LK_EXCLUSIVE lock on the vnode.
>>>>> Hello,
>>>>> the following patch should work out the issues around unionfs_nodeget() a bit:
>>>>> http://www.freebsd.org/~attilio/unionfs_nodeget2.patch
>>>>>
>>>>> Unfortunately the unionfs code is rather messy about locking requirements
>>>>> in the lookup path, so following what needs to be done there is a bit
>>>>> difficult.
>>>>> I have no way to test this patch, so it is just test-compiled at the
>>>>> moment, but I would need you to also test the lookup path (directory
>>>>> "ls", find(1) on the whole unionfs volume, etc.) to validate it
>>>>> somehow.
>>>> On second thought, I think that locking in lookup (and also other
>>>> operations) is so fragile and difficult to follow that it makes all
>>>> vnops real locking landmines.
>>>> I think that the following patch fixes the insmntque insertion and
>>>> follows the old approach well enough to be committed separately:
>>>> http://www.freebsd.org/~attilio/unionfs_nodeget3.patch
>>>>
>>> Unfortunately I have no idea about all those locking strategies and
>>> implementations.
>>> Applying unionfs_nodeget3.patch results in:
>>> sys/fs/unionfs/union_subr.c: In function 'unionfs_nodeget':
>>> sys/fs/unionfs/union_subr.c:332: error: expected statement
>>> before ')' token
>>> *** [union_subr.o] Error code 1
>>>
>>> I guess there is a typo in this chunk:
>>> @@ -317,11 +328,11 @@ unionfs_nodeget(struct mount *mp, struct vnode *up
>>>
>>> vref(vp);
>>> } else
>>> *vpp = vp;
>>> -
>>> -unionfs_nodeget_out:
>>> - if (lkflags & LK_TYPE_MASK)
>>> - vn_lock(vp, lkflags | LK_RETRY);
>>> -
>>> + if (lkflags & LK_TYPE_MASK) {
>>> + if (lkflags == LK_SHARED))
>>> ---------------------------------------- ^
>>> + vn_lock(vp, LK_DOWNGRADE | LK_RETRY);
>>> + } else
>>> + VOP_UNLOCK(vp, LK_RELEASE);
>>> return (0);
>>> }
>>>
>>> After removing the second right parenthesis kernel compiles.
>>> But it still crashes:
>>> panic: Lock (lockmgr) ufs not locked @ sys/kern/vfs_default.c:512
>>> cpuid = 1
>>> KDB: stack backtrace:
>>> ...
>>> If you can use the bt info I'll transcribe - no serial console available :-(
>>>
>>> Am I right that I should only apply _one_ unionfs-patchX.patch
>>> (unionfs_nodeget3.patch in that case)?
>> Yes, only that one.
>> Can you please do "bt" from DDB and take a picture of your screen with a camera?
>
> Ok, now I had a reason to take some time finding out how ESXi handles
> serial ports ;-) It's quite easy and very flexible, so no problem
> setting up a debug console.
> Please find attached the backtrace.
> Do I have to load any symbols? It's not very informative what I see, right?
Hi Harry,
well done.
Can you please back out the prior patch and try this one instead?:
http://www.freebsd.org/~attilio/unionfs_nodeget4.patch
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein
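
The locking constraint discussed in this thread (insmntque() wants the vnode
exclusively locked, while the caller may have asked for a shared lock or for
no lock at all) can be illustrated with a small userspace model in C. This is
only a sketch under stated assumptions: it is not the unionfs code and not any
of the patches linked above, and every toy_* name and REQ_* value is
hypothetical. It follows the general shape of the chunk quoted from
unionfs_nodeget3.patch: take the lock exclusively, insert, then downgrade or
unlock to match the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock states standing in for a vnode's lockmgr state (hypothetical). */
enum toy_lock { TOY_UNLOCKED, TOY_SHARED, TOY_EXCLUSIVE };

/* What the caller asked for; hypothetical stand-ins for lkflags values. */
enum toy_request { REQ_NONE, REQ_SHARED, REQ_EXCLUSIVE };

struct toy_vnode {
    enum toy_lock lock;
    bool on_mount_list;
};

/*
 * Models the rule from the thread: insertion into the mount's vnode list
 * is only allowed while the vnode is exclusively locked.
 * Returns 0 on success, -1 where the real kernel would report a
 * lock violation.
 */
static int
toy_insmntque(struct toy_vnode *vp)
{
    assert(vp != NULL);
    if (vp->lock != TOY_EXCLUSIVE)
        return (-1);
    vp->on_mount_list = true;
    return (0);
}

/*
 * Always lock exclusively for the insertion, then adjust the lock to
 * what the caller requested: keep it, downgrade it, or release it.
 */
static int
toy_nodeget(struct toy_vnode *vp, enum toy_request req)
{
    assert(vp != NULL);
    vp->lock = TOY_EXCLUSIVE;           /* lock before inserting */
    if (toy_insmntque(vp) != 0) {
        vp->lock = TOY_UNLOCKED;        /* do not leak the lock on error */
        return (-1);
    }
    switch (req) {
    case REQ_SHARED:
        vp->lock = TOY_SHARED;          /* models a lock downgrade */
        break;
    case REQ_NONE:
        vp->lock = TOY_UNLOCKED;        /* models an unlock */
        break;
    case REQ_EXCLUSIVE:
        break;                          /* already in the requested state */
    }
    return (0);
}
```

In this model, calling toy_insmntque() on an unlocked or shared-locked vnode
fails, which corresponds to the "is not exclusive locked but should be"
message at the top of the thread, while toy_nodeget() never returns with a
lock state other than the one requested.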