Re: A kernel crash after compiling a fresh kernel

From: David Wolfskill <david_at_catwhisker.org>
Date: Wed, 08 Jun 2022 03:06:03 UTC

On Tue, Jun 07, 2022 at 09:37:52PM -0400, Oleg Lelchuk wrote:
> The 14-CURRENT running a fresh kernel crashes with these messages:
> 
> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> 55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
> (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> #1  dump_savectx () at /usr/src/sys/kern/kern_shutdown.c:401
> #2  0xffffffff80be8ba5 in dumpsys (di=0x0)
>     at /usr/src/sys/x86/include/dump.h:87
> #3  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:430
> #4  kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:537
> #5  0xffffffff80be8fee in vpanic (fmt=<optimized out>,
>     ap=ap@entry=0xfffffe01042395a0)
>     at /usr/src/sys/kern/kern_shutdown.c:975
> #6  0xffffffff80be8d53 in panic (fmt=<unavailable>)
>     at /usr/src/sys/kern/kern_shutdown.c:899
> #7  0xffffffff810c0877 in trap_fatal (frame=0xfffffe0104239690, eva=0)
>     at /usr/src/sys/amd64/amd64/trap.c:942
> #8  0xffffffff810c092b in trap_pfault (frame=0xfffffe0104239690,
>     usermode=false, signo=<optimized out>, ucode=<optimized out>)
>     at /usr/src/sys/amd64/amd64/trap.c:761
> #9  <signal handler called>
> #10 tcp_sack_output (tp=tp@entry=0xfffffe0187312140,
>     sack_bytes_rexmt=sack_bytes_rexmt@entry=0xfffffe010423983c)
>     at /usr/src/sys/netinet/tcp_sack.c:970
> #11 0xffffffff80dd47a2 in tcp_default_output (tp=0xfffffe0187312140)
>     at /usr/src/sys/netinet/tcp_output.c:310
> #12 0xffffffff80dcd240 in tcp_output (tp=tp@entry=0xfffffe0187312140)
>     at /usr/src/sys/netinet/tcp_var.h:407
> #13 0xffffffff80dcc81a in tcp_do_segment (m=0xfffff80203467700,
>     th=0xfffff8001f190022, so=0xfffff80015d0fb40, tp=0xfffffe0187312140,
>     drop_hdrlen=64, tlen=<optimized out>, iptos=0 '\000')
>     at /usr/src/sys/netinet/tcp_input.c:2788
> #14 0xffffffff80dc8def in tcp_input_with_port (mp=<optimized out>,
>     offp=<optimized out>, proto=<optimized out>, port=port@entry=0)
>     at /usr/src/sys/netinet/tcp_input.c:1397
> #15 0xffffffff80dc9c8b in tcp_input (mp=0xfffff8011c764010, offp=0x4,
>     proto=-2128851986) at /usr/src/sys/netinet/tcp_input.c:1492
> #16 0xffffffff80db8cd7 in ip_input (m=0x0)
>     at /usr/src/sys/netinet/ip_input.c:840
> #17 0xffffffff80d3842f in netisr_dispatch_src (proto=1,
>     source=source@entry=0, m=0xfffff80203467700)
>     at /usr/src/sys/net/netisr.c:1153
> #18 0xffffffff80d3878f in netisr_dispatch (proto=477511696,
>     m=0xffffffff811c4bee) at /usr/src/sys/net/netisr.c:1244
> #19 0xffffffff80d1a7cc in ether_demux (ifp=ifp@entry=0xfffff80002873800,
>     m=0x4) at /usr/src/sys/net/if_ethersubr.c:925
> #20 0xffffffff80d1be53 in ether_input_internal (ifp=0xfffff80002873800,
>     m=0x4) at /usr/src/sys/net/if_ethersubr.c:711
> #21 ether_nh_input (m=<optimized out>)
>     at /usr/src/sys/net/if_ethersubr.c:741
> #22 0xffffffff80d3842f in netisr_dispatch_src (proto=proto@entry=5,
>     source=source@entry=0, m=m@entry=0xfffff80203467700)
>     at /usr/src/sys/net/netisr.c:1153
> #23 0xffffffff80d3878f in netisr_dispatch (proto=477511696, proto@entry=5,
>     m=0xffffffff811c4bee, m@entry=0xfffff80203467700)
>     at /usr/src/sys/net/netisr.c:1244
> #24 0xffffffff80d1ac89 in ether_input (ifp=0xfffff80002873800,
>     m=0xfffff80203467700) at /usr/src/sys/net/if_ethersubr.c:832
> #25 0xffffffff808f41f7 in re_rxeof (sc=sc@entry=0xfffffe00e1cce000,
>     rx_npktsp=0x0) at /usr/src/sys/dev/re/if_re.c:2386
> #26 0xffffffff808f1a16 in re_intr_msi (xsc=0xfffffe00e1cce000)
>     at /usr/src/sys/dev/re/if_re.c:2682
> #27 0xffffffff80ba3d49 in intr_event_execute_handlers (ie=0xfffff800020fde00,
>     p=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1205
> #28 ithread_execute_handlers (ie=0xfffff800020fde00, p=<optimized out>)
>     at /usr/src/sys/kern/kern_intr.c:1218
> #29 ithread_loop (arg=arg@entry=0xfffff800020fbf80)
>     at /usr/src/sys/kern/kern_intr.c:1306
> #30 0xffffffff80ba03e0 in fork_exit (callout=0xffffffff80ba3ad0 <ithread_loop>,
>     arg=0xfffff800020fbf80, frame=0xfffffe0104239f40)
>     at /usr/src/sys/kern/kern_fork.c:1102
> #31 <signal handler called>

I had a ... vaguely-similar panic on one laptop, after updating sources
from main-n256013-85d7875d4291 to main-n256025-91d6afe6e2a9.  I placed
a screenshot of the backtrace at
https://www.catwhisker.org/~david/FreeBSD/head/n256025/console.jpg

The files updated were:

Updating 85d7875d4291..91d6afe6e2a9
Fast-forward
 contrib/bsddialog/lib/lib_util.c        |  7 +--
 sys/arm64/arm64/identcpu.c              | 85 +++++++++++++++++++++++++++++++++
 sys/arm64/include/armreg.h              | 30 ++++++++++++
 sys/dev/alc/if_alc.c                    |  5 +-
 sys/dev/hwpmc/hwpmc_logging.c           | 21 ++++----
 sys/dev/hwpmc/hwpmc_mod.c               | 17 ++++---
 sys/dev/iommu/iommu_gas.c               | 13 ++++-
 sys/fs/fdescfs/fdesc_vnops.c            | 10 +++-
 sys/kern/subr_smp.c                     |  4 +-
 sys/kern/uipc_usrreq.c                  | 37 ++++++--------
 sys/netinet/tcp_sack.c                  | 12 +++++
 sys/sys/pmc.h                           |  4 ++
 tests/sys/kern/unix_passfd_test.c       | 37 ++++++++++----
 usr.bin/gcore/elfcore.c                 | 29 ++---------
 usr.sbin/bsdinstall/scripts/docsinstall |  3 +-
 15 files changed, 223 insertions(+), 91 deletions(-)

Command exit status: 0

I noted that there was a subsequent commit to sys/netinet/tcp_sack.c
(231e0dd5d1fb7778b1cb285e5ebee5502d5ad253, to avoid a NULL dereference).
While I don't believe I was using SACK, I went ahead and hand-applied
that small change and rebuilt.  That did not seem to help: I got a
similar-looking panic after the rebuild.
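
For anyone wanting to cross-check which updated files also show up in a
backtrace, here is a rough Python sketch.  It is only an illustration: the
helper names (frame_files, diffstat_files) are made up for this example, the
/usr/src/ prefix is assumed from the paths in the quoted trace, and the two
sample strings are short excerpts of the quoted backtrace and of the file
list above (which come from two different machines, so any overlap is a hint,
not proof).

```python
import re


def frame_files(backtrace, src_root="/usr/src/"):
    """Collect source paths (relative to src_root) from 'file:line' locations.

    Searching for the path itself, rather than for 'at <path>', tolerates
    mail-wrapped traces where 'at' and the path sit on different lines.
    """
    pattern = re.compile(re.escape(src_root) + r"([\w./+-]+):\d+")
    return set(pattern.findall(backtrace))


def diffstat_files(diffstat):
    """Collect paths from diffstat lines shaped like ' path | N ++--'."""
    files = set()
    for line in diffstat.splitlines():
        match = re.match(r"\s*(\S+)\s+\|\s+\d+", line)
        if match:
            files.add(match.group(1))
    return files


# Excerpt of the quoted backtrace (hypothetical selection of frames).
backtrace = """\
> #10 tcp_sack_output (tp=tp@entry=0xfffffe0187312140,
>                                 at /usr/src/sys/netinet/tcp_sack.c:970
> #11 0xffffffff80dd47a2 in tcp_default_output (tp=0xfffffe0187312140)
> at /usr/src/sys/netinet/tcp_output.c:310
> #25 0xffffffff808f41f7 in re_rxeof (sc=sc@entry=0xfffffe00e1cce000,
>                 rx_npktsp=0x0) at /usr/src/sys/dev/re/if_re.c:2386
> #21 ether_nh_input (m=<optimized out>) at
> /usr/src/sys/net/if_ethersubr.c:741
"""

# Excerpt of the diffstat listed above.
diffstat = """\
 sys/dev/alc/if_alc.c                    |  5 +-
 sys/kern/subr_smp.c                     |  4 +-
 sys/kern/uipc_usrreq.c                  | 37 ++++++--------
 sys/netinet/tcp_sack.c                  | 12 +++++
 15 files changed, 223 insertions(+), 91 deletions(-)
"""

# Files that were both updated and named in a backtrace frame.
suspects = sorted(frame_files(backtrace) & diffstat_files(diffstat))
print(suspects)
```

On these excerpts the only file in both sets is sys/netinet/tcp_sack.c, which
is why that file drew attention in the first place.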

Peace,
david
-- 
David H. Wolfskill                              david@catwhisker.org
"Putin is a paranoid dictator.  Putin must go. He started a senseless war
and is leading Russia into a ditch." - Egor Polyakov & Alexandra Miroshnikova

See https://www.catwhisker.org/~david/publickey.gpg for my public key.