svn commit: r303019 - head/sys/geom
Warner Losh
imp at bsdimp.com
Fri Aug 12 15:21:06 UTC 2016
On Fri, Aug 12, 2016 at 9:17 AM, Kenneth D. Merry <ken at freebsd.org> wrote:
> On Fri, Aug 12, 2016 at 09:13:58 -0600, Warner Losh wrote:
>> On Fri, Aug 12, 2016 at 9:11 AM, Kenneth D. Merry <ken at freebsd.org> wrote:
>> > On Fri, Aug 12, 2016 at 13:38:21 +0300, Andrey V. Elsukov wrote:
>> >> On 12.08.16 03:26, Bryan Drewery wrote:
>> >> > On r303467 I ran into this:
>> >> >
>> >> > panic @ time 1470916206.652, thread 0xfffff8000412f000:
>> >> > g_resize_provider_event but withered
>> >> > cpuid = 0
>> >> > Panic occurred in module kernel loaded at 0xffffffff80200000:
>> >> >
>> >> > Stack: --------------------------------------------------
>> >> > kernel:kassert_panic+0x166
>> >> > kernel:g_resize_provider_event+0x181
>> >> > kernel:g_run_events+0x186
>> >> > kernel:fork_exit+0x83
>> >> > --------------------------------------------------
>> >> >
>> >> > No further information available unfortunately.
>> >>
>> >> This one is related to r302087 :)
>> >
>> > It looks like there is a race. I think we need to replace the KASSERT
>> > in g_resize_provider_event() with a return in case the provider is
>> > withered.
>> >
>> > I won't be able to work on or test this until sometime next week. So if
>> > you guys want to go ahead and make the change, please do.
>>
>> But why are we calling g_resize_provider on a withered object? That's
>> the part I don't understand in this thread.
>
> It isn't withered when the event is queued, but it is withered by the time
> the event is executed.
>
> There is a check in g_resize_provider() to make sure it isn't withered. If
> not, the event is queued. But once g_resize_provider_event() runs, it is
> withered and we run into the KASSERT.
>
> There isn't adequate locking and ordering in there to prevent the race
> from happening, so the assert should be replaced with an "if (withered)
> return" statement.
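
The check-then-queue race described above can be sketched in plain
userland C. This is only an illustration under stated assumptions: struct
provider, struct resize_event, post_resize(), run_resize_event_strict()
and run_resize_event() are hypothetical stand-ins, not the real GEOM
structures or functions, and the single-threaded model only shows the
ordering of the two checks, not the actual locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GEOM provider; not the real struct g_provider. */
struct provider {
	bool withered;	/* set once the provider is being torn down */
	long size;	/* current media size */
};

/* Hypothetical stand-in for one queued resize event. */
struct resize_event {
	struct provider *pp;
	long new_size;
};

/*
 * Queue-time check: refuse to post the event if the provider is already
 * withered. Returns true if the event was filled in ("queued").
 */
bool
post_resize(struct provider *pp, long new_size, struct resize_event *ev)
{
	if (pp->withered)
		return (false);
	ev->pp = pp;
	ev->new_size = new_size;
	return (true);
}

/*
 * Strict handler: asserts that the provider is still alive when the
 * event runs. This aborts if the provider withered after queueing,
 * which is the failure mode discussed in the thread.
 */
void
run_resize_event_strict(struct resize_event *ev)
{
	assert(!ev->pp->withered);
	ev->pp->size = ev->new_size;
}

/*
 * Tolerant handler: rechecks the flag at execution time and drops the
 * event instead of asserting. Returns true if the resize was applied.
 */
bool
run_resize_event(struct resize_event *ev)
{
	if (ev->pp->withered)
		return (false);
	ev->pp->size = ev->new_size;
	return (true);
}
```

Posting an event for a live provider, setting withered, and then calling
run_resize_event() leaves the size untouched, while the strict variant
would abort on the same sequence. The sketch says nothing about why the
provider withers in that window.
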

I'll grant that we may wither with outstanding events, but why is it
withering? That seems odd. Either we're bogusly posting this event
just before it will wither, or something else is bogusly withering it.
Just removing the assert isn't going to fix the underlying issue.

Back to Bryan: just to be clear, this is with the latest version of
the code, and not the intermediate version that was fixed after
numerous problems surfaced, right?

Warner