debugging frequent kernel panics on 8.2-RELEASE
Andriy Gapon
avg at FreeBSD.org
Thu Aug 18 09:21:21 UTC 2011
on 18/08/2011 02:15 Steven Hartland said the following:
> ----- Original Message ----- From: "Andriy Gapon" <avg at FreeBSD.org>
>
>> Thanks to the debug that Steven provided and to the help that I received from
>> Kostik, I think that now I understand the basic mechanics of this panic, but,
>> unfortunately, not the details of its root cause.
>>
>> It seems like everything starts with some kind of a race between terminating
>> processes in a jail and termination of the jail itself. This is where the
>> details are very thin so far. What we see is that a process (http) is in
>> exit(2) syscall, in exit1() function actually, and past the place where P_WEXIT
>> flag is set and even past the place where p_limit is freed and reset to NULL.
>> At that place the thread calls prison_proc_free(), which calls prison_deref().
>> Then, we see that in prison_deref() the thread gets a page fault because of what
>> seems like a NULL pointer dereference. That's just the start of the problem and
>> its root cause.
>
> That's interesting, are you using http as an example or is that something that's
> been gleaned from the debugging of our output? I ask as there's only one process
> running in each of our jails and that's a single java process.
It's from the debug data: p_comm = "httpd"
I also would like to ask you to revert the last patch that I sent you (with tf_rip
comparisons) and try the patch from Kostik instead.
Given what we suspect about the problem, can you please also try to provoke it
by, e.g., doing frequent jail restarts or something else that would be expected
to hit the bug?
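
For the frequent-restart idea, something along these lines might serve as a
starting point. It is only a sketch: the function name restart_loop, the
RESTART_CMD variable and the default of "/etc/rc.d/jail restart" are assumptions
on my side, not something taken from your setup, so please substitute whatever
you actually use to stop and start the jails.

```sh
#!/bin/sh
# Hypothetical stress helper: restart one jail repeatedly in the hope of
# hitting the suspected race between process exit and jail teardown.
# Usage: restart_loop <jail name> [count] [pause in seconds]
# RESTART_CMD is an assumed knob; set it to the local restart command.
restart_loop() {
    jail_name="$1"
    count="${2:-100}"
    pause="${3:-5}"
    cmd="${RESTART_CMD:-/etc/rc.d/jail restart}"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "restart $i of $count: $jail_name"
        # $cmd is left unquoted on purpose so that it splits into words.
        $cmd "$jail_name" || echo "restart $i failed" >&2
        # Give the processes in the jail a moment to start up again.
        sleep "$pause"
        i=$((i + 1))
    done
}
```

Run as root, e.g. "restart_loop myjail 200 10", where myjail is a placeholder
name. A shorter pause may make the race more likely, but that is a guess.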
--
Andriy Gapon