The FreeBSD code structure that sometimes break powerpc64's pmac_thermal, buf*daemon* threads, etc. (stuck sleeping)
Mark Millard
marklmi at yahoo.com
Tue Feb 26 20:25:10 UTC 2019
[In one aspect of my comments I seem to have misapplied
some generic background information: cross CPU sbinuptime()
values are not involved for timercb vs. sleepq_timeout
in the call chain. It sometimes goes backwards on the
same CPU during the call chain.]
On 2019-Feb-26, at 04:29, Mark Millard <marklmi at yahoo.com> wrote:
> [head -r334018 based. I temporarily changed what messaging
> happens for KTR_PROC in order to gather information. As stands
> it was reporting on pmac_thermal because I can listen for the
> fans to know the stuck-sleeping has happened recently.]
>
> A powerpc64 sleepq_timeout callout usage has the structure:
>
> 0xe00000009af7c730: at sleepq_timeout+0x148
> 0xe00000009af7c7d0: at softclock_call_cc+0x234
> 0xe00000009af7c910: at callout_process+0x2e0
> 0xe00000009af7c9f0: at handleevents+0x22c
> 0xe00000009af7caa0: at timercb+0x340
> 0xe00000009af7cba0: at decr_intr+0x140
> 0xe00000009af7cbd0: at powerpc_interrupt+0x268
>
> timercb does:
>
> now = sbinuptime();
> . . .
> handleevents(now, 0);
>
> That in turn leads to:
>
> if (now >= state->nextcallopt || now >= state->nextcall) {
> state->nextcall = state->nextcallopt = SBT_MAX;
> callout_process(now);
> }
>
> Which leads to:
>
> if (tmp->c_time <= now) {
> . . .
> if (tmp->c_iflags & CALLOUT_DIRECT) {
> . . .
> softclock_call_cc(tmp, cc,
> #ifdef CALLOUT_PROFILING
> &mpcalls_dir, &lockcalls_dir, NULL,
> #endif
> 1);
>
> So softclock_call_cc and sleepq_timeout do not get a copy
> of timercb 's now value. (I will refer to this now value
> in contexts for which it is not accessible.)
>
> sleepq_timeout uses:
>
> td->td_sleeptimo > sbinuptime()
>
> to indicate to not do something like:
>
> sleepq_resume_thread(. . .)
>
> but also does not use sleepq_set_timeout_sbt to
> set up another sleepq_timeout callout or do some
> other such under this condition. (So a what is
> effectively a no-op ends up the last activity
> before the thread is stuck asleep.)
>
> (I will continue to write sbinuptime() to reference
> the value used in that test.)
>
> With multiple processors, it is observed that,
> despite sbinuptime() being physically later:
>
> sbinuptime() < now
>
> does sometimes happen for the code involved,
> for example:
>
> sbinuptime(): 0x11abf13bd142
> now : 0x11accb5df419
>
> This is even though:
>
> tmp->c_time == td->td_sleeptimo
>
> is observed. sbinuptime() < now sometimes leads
> to:
>
> sbinuptime() < tmp->c_time == td->td_sleeptimo <= now
>
> for example the following happened:
>
> sbinuptime(): 0x11abf13bd142
> tmp->c_time : 0x11ac2096af71 [== td->td_sleeptimo]
> now : 0x11accb5df419
>
> As I understand, keeping the various
> powerpc64 CPUs' sbinuptime() values fully
> synchronized is unlikely so the problem would
> still exist even if "closer but not identical"
> across CPUs was achieved.
The comments about various CPUs having somewhat
mismatched sbinuptime() values does not seem to
apply to timercb vs. softclock_call_cc in:
0xe00000009af7c7d0: at softclock_call_cc+0x234
0xe00000009af7c910: at callout_process+0x2e0
0xe00000009af7c9f0: at handleevents+0x22c
0xe00000009af7caa0: at timercb+0x340
But that means that sbinuptime() values are
going backwards on the same CPU during that
call chain's activity! Ouch.
> [The testing context here is an old PowerMac G5
> 4-core (system total), which actually involves
> 2 sockets, 2 cores per socket.]
>
> Overall this structure does not seemed to be
> designed to handle variations in sbinuptime()
> values across CPUs.
>
>
> The sleepq_timeout source code (before my
> limited modifications for reporting more):
>
> static void
> sleepq_timeout(void *arg)
> {
> struct sleepqueue_chain *sc __unused;
> struct sleepqueue *sq;
> struct thread *td;
> void *wchan;
> int wakeup_swapper;
>
> td = arg;
> wakeup_swapper = 0;
> CTR3(KTR_PROC, "sleepq_timeout: thread %p (pid %ld, %s)",
> (void *)td, (long)td->td_proc->p_pid, (void *)td->td_name);
>
> thread_lock(td);
>
> if (td->td_sleeptimo > sbinuptime() || td->td_sleeptimo == 0) {
> /*
> * The thread does not want a timeout (yet).
> */
> } else if (TD_IS_SLEEPING(td) && TD_ON_SLEEPQ(td)) {
> /*
> * See if the thread is asleep and get the wait
> * channel if it is.
> */
> wchan = td->td_wchan;
> sc = SC_LOOKUP(wchan);
> THREAD_LOCKPTR_ASSERT(td, &sc->sc_lock);
> sq = sleepq_lookup(wchan);
> MPASS(sq != NULL);
> td->td_flags |= TDF_TIMEOUT;
> wakeup_swapper = sleepq_resume_thread(sq, td, 0);
> } else if (TD_ON_SLEEPQ(td)) {
> /*
> * If the thread is on the SLEEPQ but isn't sleeping
> * yet, it can either be on another CPU in between
> * sleepq_add() and one of the sleepq_*wait*()
> * routines or it can be in sleepq_catch_signals().
> */
> td->td_flags |= TDF_TIMEOUT;
> }
>
> thread_unlock(td);
> if (wakeup_swapper)
> kick_proc0();
> }
>
>
> I wonder if just not having:
>
> td->td_sleeptimo > sbinuptime() ||
>
> in the if (. . .) is appropriate for the powerpc64 context in
> use: presume callout_process 's tmp->c_time <= now is
> sufficient? (Vs: race issues?)
>
>
>
> Adjusted KTR_PROC output example ("show ktr /v" in ddb
> captured, so newest to oldest order, flags and such are
> in hex despite my not providing 0x prefixes):
>
> 922 (0xc00000000d12f000:cpu3) 151713214399 /usr/src/sys/kern/subr_sleepqueue.c.1027: sleepq_timeout thread not want timeout yet: thread 0xc00000000a8315a0 (pid 26, pmac_thermal) td_sleeptimo=11ac2096af71 sbinuptime=11abf13bd142
> 921 (0xc00000000d12f000:cpu3) 151713214374 /usr/src/sys/kern/subr_sleepqueue.c.1016: sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 920 (0xc00000000d12f000:cpu3) 151713214337 /usr/src/sys/kern/kern_timeout.c.503: callout_process to call softclock_clock_cc: thread 0xc00000000a8315a0 c_time=11ac2096af71 now=11accb5df419
> 919 (0xc00000000a8315a0:cpu3) 151686051357 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
> 918 (0xc00000000a8315a0:cpu3) 151686051310 /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: sbt=fffffed8 pr=0 flags=100 td_sleeptimo=11ac2096af71 prec=fffffed flags=501
> 917 (0xc00000000a8315a0:cpu3) 151686051306 /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 916 (0xc00000000a8315a0:cpu3) 151686051267 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on pmac_therm (0xc0000000015082c3)
> 915 (0xc00000000a8315a0:cpu3) 151686051162 /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 914 (0xc00000000a8315a0:cpu3) 151686051144 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
> 913 (0xc000000001f6e000:cpu0) 151686050654 /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 912 (0xc000000001f6e000:cpu0) 151686050634 /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 911 (0xc00000000a8315a0:cpu3) 151684339557 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
> 910 (0xc00000000a8315a0:cpu3) 151684339525 /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=11abe0139d4d prec=cccccbe flags=501
> 909 (0xc00000000a8315a0:cpu3) 151684339517 /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 908 (0xc00000000a8315a0:cpu3) 151684339498 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on smu (0xe000000087fd1670)
> 907 (0xc00000000a8315a0:cpu3) 151684339326 /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 906 (0xc00000000a8315a0:cpu3) 151684339313 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
> 905 (0xc000000001f6e000:cpu3) 151684338924 /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 904 (0xc000000001f6e000:cpu3) 151684338918 /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 903 (0xc00000000a8315a0:cpu3) 151682628069 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
> 902 (0xc00000000a8315a0:cpu3) 151682628004 /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=11abd3054758 prec=cccccbe flags=501
> 901 (0xc00000000a8315a0:cpu3) 151682628000 /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 900 (0xc00000000a8315a0:cpu3) 151682627960 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on smu (0xe000000087fd1670)
> 899 (0xc00000000a8315a0:cpu3) 151682627706 /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 898 (0xc00000000a8315a0:cpu3) 151682627683 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
> 897 (0xc000000001f6e000:cpu0) 151682627254 /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 896 (0xc000000001f6e000:cpu0) 151682627242 /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 895 (0xc00000000a8315a0:cpu0) 151680915222 /usr/src/sys/kern/kern_synch.c.435: mi_switch: old thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
> 894 (0xc00000000a8315a0:cpu0) 151680915139 /usr/src/sys/kern/subr_sleepqueue.c.409: sleepq_set_timeout_sbt: sbt=cccccbe0 pr=0 flags=100 td_sleeptimo=11abc5f6f163 prec=cccccbe flags=501
> 893 (0xc00000000a8315a0:cpu0) 151680915133 /usr/src/sys/kern/subr_sleepqueue.c.407: sleepq_set_timeout_sbt installing sleepq_timeout: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 892 (0xc00000000a8315a0:cpu0) 151680915085 /usr/src/sys/kern/kern_synch.c.180: sleep: thread 100078 (pid 26, pmac_thermal) on smu (0xe000000087fd1670)
> 891 (0xc00000000a8315a0:cpu0) 151680914811 /usr/src/sys/kern/subr_sleepqueue.c.631: sleepq resume: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 890 (0xc00000000a8315a0:cpu0) 151680914784 /usr/src/sys/kern/kern_synch.c.445: mi_switch: new thread 100078 (td_sched 0xc00000000a831ad8, pid 26, pmac_thermal)
> 889 (0xc000000001f6e000:cpu2) 151680914304 /usr/src/sys/kern/subr_sleepqueue.c.837: sleepq_wakeup: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> 888 (0xc000000001f6e000:cpu2) 151680914284 /usr/src/sys/kern/subr_sleepqueue.c.989: sleepq_remove_matching calling sleepq_resume_thread: thread 0xc00000000a8315a0 (pid 26, pmac_thermal)
> . . .
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
More information about the freebsd-ppc
mailing list