Nasty non-recursive lockmgr panic on softdep only enabled UFS
partition when filesystem full
Garrett Cooper
yanegomi at gmail.com
Thu May 5 17:23:52 UTC 2011
On May 4, 2011, at 2:07 AM, Kostik Belousov wrote:
> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper <yanegomi at gmail.com> wrote:
>>> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick <mckusick at mckusick.com> wrote:
>>>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>>>>> partition when filesystem full
>>>>> From: Garrett Cooper <yanegomi at gmail.com>
>>>>> To: Jeff Roberson <jeff at freebsd.org>,
>>>>> Marshall Kirk McKusick <mckusick at mckusick.com>
>>>>> Cc: FreeBSD Current <freebsd-current at freebsd.org>
>>>>>
>>>>> Hi Jeff and Dr. McKusick,
>>>>> Ran into this panic when /usr ran out of space doing a make
>>>>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>>>>> after the filesystem ran out of space -- wasn't quite sure what it was
>>>>> doing at the time):
>>>>>
>>>>> ...
>>>>>
>>>>> Let me know what other commands you would like for me to run in kgdb.
>>>>> Thanks,
>>>>> -Garrett
>>>>
>>>> You did not indicate whether you are running an 8.X system or a 9-current
>>>> system. It would be helpful to know that.
>>>
>>> I've actually been running CURRENT for a few years now, but you're right --
>>> I didn't mention that part.
>>>
>>>> Jeff thinks that there may be a potential race in the locking code for
>>>> softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>>>
>>>> Index: ffs_softdep.c
>>>> ===================================================================
>>>> --- ffs_softdep.c (revision 221385)
>>>> +++ ffs_softdep.c (working copy)
>>>> @@ -11380,7 +11380,8 @@
>>>>          continue;
>>>>      }
>>>>      MNT_IUNLOCK(mp);
>>>> -    if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>>>> +    if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>>>> +        curthread)) {
>>>>          MNT_ILOCK(mp);
>>>>          continue;
>>>>      }
>>>>
>>>> If you are running an 8.X system, hopefully you will be able to apply it.
>>>
>>> I've applied it, rebuilt and installed the kernel, and am now trying
>>> to reproduce the case again. Will let you know how things go!
>>
>> Happened again with the change. It's really easy to repro:
>>
>> 1. Get a filesystem with UFS+SU
>> 2. Execute something that does a large number of small writes to a partition.
>> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>>
>> The kernel will panic with the issue I discussed above.
>> Thanks!
>
> Jeff's change is required to avoid LORs (lock order reversals), but it
> is not sufficient to prevent recursion. We must skip the vnode supplied
> as a parameter to softdep_request_cleanup(). Theoretically, other vnodes
> might also be locked by curthread, so I think the change below is
> needed. Try this.
>
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index a6d4441..25fa5d6 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -11380,7 +11380,9 @@ retry:
>          continue;
>      }
>      MNT_IUNLOCK(mp);
> -    if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> +    if (VOP_ISLOCKED(lvp) ||
> +        vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
> +        curthread)) {
>          MNT_ILOCK(mp);
>          continue;
>      }
I ran into the same panic after applying the patch above, using the repro steps I described before. One thing I noticed: the issue is much harder to reproduce unless the dd runs in parallel with the make operation.
Thanks,
-Garrett
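
For readers following the thread, the idea behind the two patches quoted above (skip any vnode that is already locked, and otherwise only try the lock without sleeping) can be sketched in userspace C. This is only an analogy built on POSIX threads, not kernel code: `struct node`, `clean_nodes()` and `demo_skip_locked()` are hypothetical names invented for the illustration, and `pthread_mutex_trylock()` stands in for `vget()` with `LK_NOWAIT`.

```c
#include <pthread.h>

#define NNODES 4

/* Hypothetical stand-in for a vnode: a non-recursive lock and a dirty flag. */
struct node {
	pthread_mutex_t lock;
	int dirty;
};

/*
 * Walk all nodes and clean the ones that can be locked without blocking.
 * A node whose lock is already held (by any thread, including the caller)
 * is skipped instead of waited for, so the caller can never recurse on a
 * non-recursive lock it already owns. Returns the number of nodes cleaned.
 */
static int
clean_nodes(struct node *nodes, int n)
{
	int cleaned = 0;

	for (int i = 0; i < n; i++) {
		/* trylock returns nonzero (EBUSY) rather than blocking. */
		if (pthread_mutex_trylock(&nodes[i].lock) != 0)
			continue;
		if (nodes[i].dirty) {
			nodes[i].dirty = 0;
			cleaned++;
		}
		pthread_mutex_unlock(&nodes[i].lock);
	}
	return (cleaned);
}

/*
 * Assumed scenario: the caller already holds node 1, much like the vnode
 * that is passed in to the cleanup routine while still locked.
 */
static int
demo_skip_locked(void)
{
	struct node nodes[NNODES];
	int cleaned;

	for (int i = 0; i < NNODES; i++) {
		pthread_mutex_init(&nodes[i].lock, NULL);
		nodes[i].dirty = 1;
	}

	pthread_mutex_lock(&nodes[1].lock);
	cleaned = clean_nodes(nodes, NNODES);
	pthread_mutex_unlock(&nodes[1].lock);

	for (int i = 0; i < NNODES; i++)
		pthread_mutex_destroy(&nodes[i].lock);
	return (cleaned);
}
```

Calling `demo_skip_locked()` cleans every node except the one the caller already holds. The trade-off is the same as in the patches: a busy node is left dirty for a later pass rather than risking recursion or a sleep with other locks held.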