Re: Hang ast / pipelk / piperd

From: Paul Floyd <paulf2718_at_gmail.com>
Date: Wed, 01 Jun 2022 21:27:52 UTC

On 6/1/22 14:16, Mark Johnston wrote:
> On Mon, May 30, 2022 at 09:35:05PM +0200, Paul Floyd wrote:
>>
>>
>> On 5/30/22 14:15, Mark Johnston wrote:
>>
>>> "procstat -kk <valgrind PID>" might help to reveal what's going on,
>>> since it sounds like the hand/livelock is happening somewhere in the
>>> kernel.
>>
>> Not knowing much about the kernel, my guess is that this is related to
>>
>> commit 4808bab7fa6c3ec49b49476b8326d7a0250a03fa
>> Author: Alexander Motin <mav@FreeBSD.org>
>> Date:   Tue Sep 21 18:14:22 2021 -0400
>>
>>       sched_ule(4): Improve long-term load balancer.
>>
>> and this bit of ast code
>>
>> doreti_ast:
>> 	/*
>> 	 * Check for ASTs atomically with returning.  Disabling CPU
>> 	 * interrupts provides sufficient locking even in the SMP case,
>> 	 * since we will be informed of any new ASTs by an IPI.
>> 	 */
>> 	cli
>> 	movq	PCPU(CURTHREAD),%rax
>> 	testl	$TDF_ASTPENDING | TDF_NEEDRESCHED,TD_FLAGS(%rax)
>> 	je	doreti_exit
>> 	sti
>> 	movq	%rsp,%rdi	/* pass a pointer to the trapframe */
>> 	call	ast
>> 	jmp	doreti_ast
>>
>>
>> The above commit seems to be migrating loaded threads to another CPU.
> 
> How did you infer that?  The long-term load balancer should be running
> fairly infrequently.
> 


Well one thread is hung in
mi_switch+0xc2 ast+0x1e6 doreti_ast+0x1f

I found the above code referring to doreti_ast with grep and thought 
that if the test always fails then it will loop forever. So I looked for 
git commits containing TDF_ASTPENDING and TDF_NEEDRESCHED and found a 
couple of commits mentioning these flags, scheduling and found the above 
and the followup

commit 11f14b33629e552a451fdbfe653ebb0addd27700
Author: Alexander Motin <mav@FreeBSD.org>
Date:   Sun Sep 26 12:03:05 2021 -0400

     sched_ule(4): Fix hang with steal_thresh < 2.

     e745d729be60 caused infinite loop with interrupts disabled in load
     stealing code if steal_thresh set below 2.  Such configuration should
     not generally be used, but appeared some people are using it to
     workaround some problems.

and I guessed they might be related.

> As a side note, I think we are missing ktrcsw() calls in some places,
> e.g., in turnstile_wait().
> 
>> My test system is a VirtualBox amd64 FreeBSD 13.1 with one CPU running
>> on a 13.0 host.
>>
>> I just tried restarting the VM with 2 CPUs and the testcase seems to be
>> a lot better - it's been running in a loop for 10 minutes whereas
>> previously it would hang at least 1 in 5 times.
> 
> Hmm.  Could you, please, show the ktrace output with -H -T passed to
> kdump(1), together with fresh "procstat -kk" output?
> 
> The fact that the problem apparently only occurs with 1 CPU suggests a
> scheduler bug, indeed.


Here is the fresh procstat output

paulf@freebsd:~ $ procstat -kk 12339
   PID    TID COMM                TDNAME              KSTACK 

12339 100089 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
umtxq_sleep+0x143 do_wait+0x3e5 __umtx_op_wait+0x53 sys__umtx_op+0x7e 
amd64_syscall+0x10c fast_syscall_common+0xf8

12339 100582 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_timedwait_sig+0x12 _sleep+0x1d1 
kern_clock_nanosleep+0x1c1 sys_nanosleep+0x3b amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100583 none-amd64-freebsd  -                   mi_switch+0xc2 
ast+0x1e6 doreti_ast+0x1f

12339 100584 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0x3d6 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100585 none-amd64-freebsd  -                   mi_switch+0xc2 
intr_event_handle+0x167 intr_execute_handlers+0x4b Xapic_isr1+0xdc 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0x3d6 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100586 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0x3d6 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100587 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0x3d6 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100588 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0x3d6 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100589 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0x3d6 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100590 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0x3d6 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100591 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0x3d6 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100592 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0xb3 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100593 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0xb3 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100594 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0xb3 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8

12339 100595 none-amd64-freebsd  -                   mi_switch+0xc2 
sleepq_catch_signals+0x2e6 sleepq_wait_sig+0x9 _sleep+0x1f2 
pipe_read+0xb3 dofileread+0x81 sys_read+0xbc amd64_syscall+0x10c 
fast_syscall_common+0xf8


And the ktrace

  12339 100591 none-amd64-freebsd 1654117859.896834 RET   nanosleep 0
  12339 100591 none-amd64-freebsd 1654117859.896839 CALL 
sigprocmask(SIG_SETMASK,0x404563db0,0)
  12339 100591 none-amd64-freebsd 1654117859.896840 RET   sigprocmask 0
  12339 100591 none-amd64-freebsd 1654117859.896842 CALL 
thr_self(0x404563ca8)
  12339 100591 none-amd64-freebsd 1654117859.896844 RET   thr_self 0
  12339 100591 none-amd64-freebsd 1654117859.896845 CALL 
read(0x6f57,0x404563cfe,0x1)
  12339 100591 none-amd64-freebsd 1654117859.896847 CSW   stop kernel 
"pipelk"
  12339 100583 none-amd64-freebsd 1654117859.896849 GIO   fd 28503 read 
1 byte
        "T"
  12339 100583 none-amd64-freebsd 1654117859.896852 RET   read 1
  12339 100583 none-amd64-freebsd 1654117859.896853 CSW   stop user "ast"
  12339 100590 none-amd64-freebsd 1654117859.896854 CSW   resume kernel 
"pipelk"
  12339 100590 none-amd64-freebsd 1654117859.896854 CSW   stop kernel 
"piperd"
  12339 100589 none-amd64-freebsd 1654117859.896855 CSW   resume kernel 
"pipelk"
  12339 100589 none-amd64-freebsd 1654117859.896855 CSW   stop kernel 
"piperd"
  12339 100588 none-amd64-freebsd 1654117859.896856 CSW   resume kernel 
"pipelk"
  12339 100588 none-amd64-freebsd 1654117859.896856 CSW   stop kernel 
"piperd"

[REPEATS, TRIMMMED]

  12339 100590 none-amd64-freebsd 1654117859.896896 CSW   resume kernel 
"piperd"
  12339 100590 none-amd64-freebsd 1654117859.896897 CSW   stop kernel 
"piperd"
  12339 100089 none-amd64-freebsd Events dropped.
  12339 100089 none-amd64-freebsd 1654118042.172196 CALL 
thr_self(0x40268d7f8)
  12339 100089 none-amd64-freebsd 1654118042.172198 RET   thr_self 0
  12339 100089 none-amd64-freebsd 1654118042.172202 CALL 
thr_self(0x40268d788)
  12339 100089 none-amd64-freebsd 1654118042.172204 RET   thr_self 0
  12339 100089 none-amd64-freebsd 1654118042.172206 CALL 
read(0x6f57,0x40268d7de,0x1)
  12339 100584 none-amd64-freebsd 1654118042.172217 CSW   stop kernel 
"piperd"
  12339 100089 none-amd64-freebsd 1654118042.172221 CSW   stop kernel 
"piperd"
  12339 100584 none-amd64-freebsd 1654118042.172222 CSW   resume kernel 
"piperd"
  12339 100584 none-amd64-freebsd 1654118042.172222 CSW   stop kernel 
"piperd"

[REPEATS, TRIMMMED]

  12339 100089 none-amd64-freebsd 1654118042.172259 CSW   stop kernel 
"piperd"
  12339 100584 none-amd64-freebsd 1654118042.172260 CSW   resume kernel 
"piperd"
  12339 100089 none-amd64-freebsd Events dropped.
  12337 100093 sh       1654117857.635559 CSW   stop kernel "wait"
  12337 100093 sh       1654118057.965243 CSW   resume kernel "wait"
  12337 100093 sh       1654118057.965258 RET   wait4 12339/0x3033
  12337 100093 sh       1654118057.965356 CALL 
fstatat(AT_FDCWD,0x7fffffffe0c0,0x7fffffffe4c0,0)
  12337 100093 sh       1654118057.965359 NAMI 
"/usr/share/nls/C.UTF-8/libc.cat"
  12337 100093 sh       1654118057.965363 RET   fstatat -1 errno 2 No 
such file or directory
  12337 100093 sh       1654118057.965367 CALL 
fstatat(AT_FDCWD,0x7fffffffe0c0,0x7fffffffe4c0,0)
  12337 100093 sh       1654118057.965369 NAMI 
"/usr/share/nls/libc/C.UTF-8"
  12337 100093 sh       1654118057.965371 RET   fstatat -1 errno 2 No 
such file or directory
  12337 100093 sh       1654118057.965373 CALL 
fstatat(AT_FDCWD,0x7fffffffe0c0,0x7fffffffe4c0,0)
  12337 100093 sh       1654118057.965375 NAMI 
"/usr/local/share/nls/C.UTF-8/libc.cat"
  12337 100093 sh       1654118057.965377 RET   fstatat -1 errno 2 No 
such file or directory
  12337 100093 sh       1654118057.965380 CALL 
fstatat(AT_FDCWD,0x7fffffffe0c0,0x7fffffffe4c0,0)
  12337 100093 sh       1654118057.965381 NAMI 
"/usr/local/share/nls/libc/C.UTF-8"
  12337 100093 sh       1654118057.965383 RET   fstatat -1 errno 2 No 
such file or directory
  12337 100093 sh       1654118057.965410 CALL  write(0x2,0x80184c000,0x7)
  12337 100093 sh       1654118057.965417 GIO   fd 2 wrote 7 bytes
        "Killed
        "

A+
Paul