scheduler (sched_4bsd) questions
Peter Holm
peter at holm.cc
Tue Oct 5 06:03:19 PDT 2004
On Mon, Oct 04, 2004 at 12:42:53PM -0700, Julian Elischer wrote:
OK, I got a crash dump now, after a few modifications to kern_shutdown.c
There are however a few strange things worth noticing:
1) There is no panic string:
Mounted root from ufs:/dev/ad0s1a.
pid 1146: corrected slot count (2->1)
[thread 100796]
Stopped at sched_add+0x13: movl 0x14c(%esi),%ebx
2) The gdb stack trace gets a bit weird at:
#8 0xc07812da in calltrap () at ../../../i386/i386/exception.s:140
#9 0xc05f0018 in flock (td=0x0, uap=0x0) at ../../../kern/kern_descrip.c:2138
#10 0xc0619fd1 in setrunqueue (td=0xc2319180, flags=0x0) at kern_switch.c:521
#11 0xc061921f in sched_wakeup (td=0xc2319180) at ../../../kern/sched_4bsd.c:859
Where did flock() come from?
The full console output is at http://www.holm.cc/stress/log/cons82.html
- Peter
> ok, then if it happens again, from ddb, run
> show ktr
> after you've done the 'ps' and go back a couple of hundred events..
>
> thanks.
>
>
> Peter Holm wrote:
>
> >On Mon, Oct 04, 2004 at 11:57:45AM -0700, Julian Elischer wrote:
> >
> >
> >>can you run ktrdump against the corefile and get the ktr output?
> >>(you do have it enabled, right?)
> >>
> >>
> >>
> >
> >No, that's one of the problems: doadump() fails with this specific panic.
> >
> >- Peter
> >
> >
> >
> >>Peter Holm wrote:
> >>
> >>
> >>
> >>>On Mon, Oct 04, 2004 at 01:34:38PM -0400, Stephan Uphoff wrote:
> >>>
> >>>
> >>>
> >>>
> >>>>On Mon, 2004-10-04 at 11:31, John Baldwin wrote:
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>>On Friday 01 October 2004 12:13 am, Stephan Uphoff wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>>On Wed, 2004-09-29 at 18:14, Stephan Uphoff wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>I was looking at the MUTEX_WAKE_ALL undefined case when I used the
> >>>>>>>critical section for turnstile_claim().
> >>>>>>>However there are bigger problems with MUTEX_WAKE_ALL undefined
> >>>>>>>so you are right - the critical section for turnstile_claim is pretty
> >>>>>>>useless.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>Arghhh !!!
> >>>>>>
> >>>>>>MUTEX_WAKE_ALL is NOT an option in GENERIC.
> >>>>>>I recall verifying that it is defined twice. Guess I must have looked at
> >>>>>>the wrong source tree :-(
> >>>>>>This means yes - we have bigger problems!
> >>>>>>
> >>>>>>Example:
> >>>>>>
> >>>>>>Thread A holds a mutex x contested by Thread B and C and has priority
> >>>>>>pri(A).
> >>>>>>
> >>>>>>Thread C holds a mutex y and pri(B) < pri(C)
> >>>>>>
> >>>>>>Thread A releases the lock and wakes thread B, but leaves C on the
> >>>>>>turnstile wait queue.
> >>>>>>
> >>>>>>An interrupt thread I tries to lock mutex y owned by C.
> >>>>>>
> >>>>>>However priority inheritance does not work since B needs to run first to
> >>>>>>take ownership of the lock.
> >>>>>>
> >>>>>>I is blocked :-(
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>Ermm, if the interrupt happens after x is released then I's priority
> >>>>>should propagate from I to C to B.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>There is a hole after the mutex x is released by A - but before B can
> >>>>claim the mutex. The turnstile for mutex x is unowned and interrupt
> >>>>thread I when trying to donate its priority will run into:
> >>>>
> >>>>	if (td == NULL) {
> >>>>		/*
> >>>>		 * This really isn't quite right. Really
> >>>>		 * ought to bump priority of thread that
> >>>>		 * next acquires the lock.
> >>>>		 */
> >>>>		return;
> >>>>	}
> >>>>
> >>>>So B needs to run and acquire the mutex before priority inheritance
> >>>>works again, yet it gets no priority boost to do so.
> >>>>
> >>>>This is easy to fix and MUTEX_WAKE_ALL can be removed again at that time
> >>>>- but my time budget is limited and Peter has an interesting bug left
> >>>>that has priority.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>I'm no closer to being able to create this panic in a controlled way.
> >>>After a whole day of different tests I finally got this panic:
> >>>http://www.holm.cc/stress/log/cons81.html. The trigger seems to be one
> >>>particular Java applet, but it is not easily reproducible.
> >>>
> >>>- Peter
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>>>If the interrupt happens before x is released,
> >>>>>then the final bit of propagate_priority() should handle it since it
> >>>>>re-sorts the turnstile's thread queue so that C will be awakened rather
> >>>>>than B.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>Agreed.
> >>>>
> >>>> Stephan
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>>
> >
> >
> >
--
Peter Holm
-------------- next part --------------
Index: kern_shutdown.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_shutdown.c,v
retrieving revision 1.166
diff -u -r1.166 kern_shutdown.c
--- kern_shutdown.c 2 Sep 2004 18:59:15 -0000 1.166
+++ kern_shutdown.c 5 Oct 2004 12:23:45 -0000
@@ -230,10 +230,14 @@
return;
}
+ if (panicstr == NULL)
+ panicstr = "In doadump()"; /* Major hack XXX pho */
savectx(&dumppcb);
dumptid = curthread->td_tid;
dumping++;
dumpsys(&dumper);
+ if (!strcmp(panicstr, "In doadump()"))
+ panicstr = NULL; /* Major hack XXX pho */
}
/*
@@ -519,6 +523,8 @@
#endif
#ifdef KDB
+ if (panicstr == NULL)
+ panicstr = "(NULL)"; /* XXX pho */
if (newpanic && trace_on_panic)
kdb_backtrace();
if (debugger_on_panic)