Re: kernel core debugging

From: Mark Millard <marklmi_at_yahoo.com>
Date: Wed, 06 Nov 2024 19:19:31 UTC
On Nov 6, 2024, at 10:07, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
> 
>> On 6. Nov 2024, at 17:51, Mark Millard <marklmi@yahoo.com> wrote:
>> 
>> 
>> 
>> On Nov 6, 2024, at 09:28, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
>> 
>>>> On 6. Nov 2024, at 15:12, Mark Millard <marklmi@yahoo.com> wrote:
>>>> 
>>>> On Nov 6, 2024, at 01:44, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
>>>> 
>>>>> is debugging a kernel panic by using kgdb or lldb on a core file
>>>>> supposed to work? At least it is not right now for me...
>>>> 
>>>> # kgdb /boot/kernel.GENERIC-NODEBUG/kernel /var/crash/vmcore.2
>>>> GNU gdb (GDB) 15.1 [GDB v15.1 for FreeBSD]
>>>> Copyright (C) 2024 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law.
>>>> Type "show copying" and "show warranty" for details.
>>>> This GDB was configured as "aarch64-portbld-freebsd15.0".
>>>> Type "show configuration" for configuration details.
>>>> For bug reporting instructions, please see:
>>>> <https://www.gnu.org/software/gdb/bugs/>.
>>>> Find the GDB manual and other documentation resources online at:
>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>> 
>>>> For help, type "help".
>>>> Type "apropos word" to search for commands related to "word"...
>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/kernel...
>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/kernel.debug...
>>>> 
>>>> Unread portion of the kernel message buffer:
>>>> KDB: enter: manual escape to debugger
>>>> 
>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/uhid.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/uhid.ko.debug...
>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/wmt.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/wmt.ko.debug...
>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/ums.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/ums.ko.debug...
>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/zfs.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/zfs.ko.debug...
>>>> 0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404
>>>> warning: 404 /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c: No such file or directory
>>>> (kgdb) bt
>>>> #0  0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404
>>>> #1  0xffff0000000ee6a8 in db_dump (dummy=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>) at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:596
>>>> #2  0xffff0000000ee478 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=true) at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:508
>>>> #3  0xffff0000000ee150 in db_command_loop () at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:555
>>>> #4  0xffff0000000f1ff4 in db_trap (type=<optimized out>, code=<optimized out>) at /home/pkgbuild/worktrees/main/sys/ddb/db_main.c:267
>>>> #5  0xffff000000568b0c in kdb_trap (type=60, code=0, tf=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/subr_kdb.c:790
>>>> #6  <signal handler called>
>>>> #7  kdb_enter (why=<optimized out>, msg=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/subr_kdb.c:556
>>>> #8  0xffff0000003625cc in vt_machine_kbdevent (vd=<optimized out>, c=<optimized out>) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:761
>>>> #9  vt_processkey (kbd=0xffffa000803caa80, vd=0xffff000000d24360 <vt_consdev>, c=-2147483514) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:903
>>>> #10 vt_kbdevent (kbd=0xffffa000803caa80, event=<optimized out>, arg=0xffff000000d24360 <vt_consdev>) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:1030
>>>> #11 0xffff0000001ea048 in kbdmux_intr (kbd=0xffffa000803caa80, arg=<optimized out>) at /home/pkgbuild/worktrees/main/sys/dev/kbdmux/kbdmux.c:565
>>>> #12 0xffff0000005839ac in taskqueue_run_locked (queue=queue@entry=0xffffa000803c9c00) at /home/pkgbuild/worktrees/main/sys/kern/subr_taskqueue.c:517
>>>> #13 0xffff000000583714 in taskqueue_run (queue=0xffffa000803c9c00) at /home/pkgbuild/worktrees/main/sys/kern/subr_taskqueue.c:532
>>>> #14 0xffff0000004bc114 in intr_event_execute_handlers (ie=0xffffa0008028ec00, p=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1183
>>>> #15 ithread_execute_handlers (ie=0xffffa0008028ec00, p=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1196
>>>> #16 ithread_loop (arg=<optimized out>, arg@entry=0xffffa000803de5a0) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1289
>>>> #17 0xffff0000004b700c in fork_exit (callout=0xffff0000004bbd78 <ithread_loop>, arg=0xffffa000803de5a0, frame=0xffff00010ca97a00) at /home/pkgbuild/worktrees/main/sys/kern/kern_fork.c:1151
>>>> #18 <signal handler called>
>>>> 
>>>> The context here was from an official PkgBase kernel and world
>>>> installation.
>>>> 
>>>> . . . (deletion) . . .
>>>> 
>>>> You may have to be more explicit about the specifics of the
>>>> problem(s) you are seeing.
>>> OK. Here is what I am referring to:
>>> 
>>> tuexen@head:~ % sudo kgdb -c /var/crash/vmcore.last /boot/kernel/kernel
>> 
>> That command does not match the parameter order in the man page or
>> in my example.
>> 
>> man kgdb output shows kernel first, then core:
>> 
>> SYNOPSIS
>>   kgdb [-a | -f | -fullname] [-b rate] [-q | -quiet] [-v] [-w]
>>        [-d crashdir] [-c core | -n dumpnr | -r device] [kernel [core]]
>> 
>> My example: # kgdb /boot/kernel.GENERIC-NODEBUG/kernel /var/crash/vmcore.2
>> 
>> You might want to see if using the other order makes a difference.
> No, it doesn't. I'm specifying the core via the -c core option instead of 
> the second argument...

Of course. Sorry for the noise. (Does not look like this is
going to be one of my better mornings.)

About all we have learned is that your issue is somehow
specific to your context rather than an example of kgdb
being generally broken.

> Best regards
> Michael
>> 
>>> Password:
>>> GNU gdb (GDB) 15.1 [GDB v15.1 for FreeBSD]
>>> Copyright (C) 2024 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.
>>> Type "show copying" and "show warranty" for details.
>>> This GDB was configured as "aarch64-portbld-freebsd15.0".
>>> Type "show configuration" for configuration details.
>>> For bug reporting instructions, please see:
>>> <https://www.gnu.org/software/gdb/bugs/>.
>>> Find the GDB manual and other documentation resources online at:
>>> <http://www.gnu.org/software/gdb/documentation/>.
>>> 
>>> For help, type "help".
>>> Type "apropos word" to search for commands related to "word"...
>>> Reading symbols from /boot/kernel/kernel...
>>> Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...
>>> 
>>> Unread portion of the kernel message buffer:
>>> panic: tcp_do_segment: sent too much
>>> cpuid = 1
>>> time = 1730910226
>>> KDB: stack backtrace:
>>> db_trace_self() at db_trace_self
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x38
>>> vpanic() at vpanic+0x1a0
>>> panic() at panic+0x48
>>> tcp_do_segment() at tcp_do_segment+0x2794
>>> tcp_input_with_port() at tcp_input_with_port+0xcbc
>>> tcp_input() at tcp_input+0x10
>>> ip_input() at ip_input+0x35c
>>> netisr_dispatch_src() at netisr_dispatch_src+0xd8
>>> tunwrite() at tunwrite+0x2a8
>>> devfs_write_f() at devfs_write_f+0x108
>>> dofilewrite() at dofilewrite+0x7c
>>> kern_writev() at kern_writev+0x4c
>>> sys_writev() at sys_writev+0x40
>>> do_el0_sync() at do_el0_sync+0x60c
>>> handle_el0_sync() at handle_el0_sync+0x4c
>>> --- exception, esr 0x56000000
>>> KDB: enter: panic
>>> 
>>> Reading symbols from /boot/kernel/tcp_rack.ko...
>>> Reading symbols from /usr/lib/debug//boot/kernel/tcp_rack.ko.debug...
>>> Reading symbols from /boot/kernel/tcphpts.ko...
>>> Reading symbols from /usr/lib/debug//boot/kernel/tcphpts.ko.debug...
>>> Reading symbols from /boot/kernel/if_bridge.ko...
>>> Reading symbols from /usr/lib/debug//boot/kernel/if_bridge.ko.debug...
>>> Reading symbols from /boot/kernel/bridgestp.ko...
>>> Reading symbols from /usr/lib/debug//boot/kernel/bridgestp.ko.debug...
>>> Reading symbols from /boot/kernel/uhid.ko...
>>> Reading symbols from /usr/lib/debug//boot/kernel/uhid.ko.debug...
>>> Reading symbols from /boot/kernel/wmt.ko...
>>> Reading symbols from /usr/lib/debug//boot/kernel/wmt.ko.debug...
>>> Reading symbols from /boot/kernel/cc_newreno.ko...
>>> Reading symbols from /usr/lib/debug//boot/kernel/cc_newreno.ko.debug...
>>> 0xffff0000004b5644 in doadump (textdump=0) at /usr/home/tuexen/freebsd-src/sys/kern/kern_shutdown.c:404
>>> 404 dump_savectx();
>>> (kgdb) up
>>> #1  0x3fdb0000000e99f0 in ?? ()

Somehow it went from referencing the apparently correct/expected
0xffff0000004b5644 (doadump) to referencing 0x3fdb0000000e99f0.
There is also the odd "404 dump_savectx();".

If bt is used as the first command at the prompt, what does it
show for the backtrace? Anything interesting? Or just #0 for
0xffff0000004b5644 and #1 for 0x3fdb0000000e99f0?
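
For comparing such backtraces, a rough check can flag frames whose
address does not look like a kernel address. The following is only a
sketch in Python, not part of kgdb. It assumes that FreeBSD/aarch64
kernel addresses have the upper 16 bits all set (0xffff...), which
matches every resolved frame shown in this thread. The function name
suspicious_frames and the two-line sample (assembled from the output
quoted in this message, with a hypothetical "#0" prefix as bt would
print it) are made up for illustration.

```python
import re

# Assumption: on FreeBSD/aarch64, kernel virtual addresses have the
# upper 16 bits set (0xffff...), as every resolved frame in this thread does.
KERNEL_TOP16 = 0xFFFF

# Matches gdb backtrace lines of the form "#N  0xADDR in symbol ...".
FRAME_RE = re.compile(r"^#(\d+)\s+0x([0-9a-fA-F]+)\s+in\s+(\S+)")


def suspicious_frames(backtrace_text):
    """Return (frame_number, address, symbol) for each frame whose
    address does not have the expected kernel upper bits."""
    bad = []
    for line in backtrace_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if not m:
            continue  # lines without a leading address are skipped
        num = int(m.group(1))
        addr = int(m.group(2), 16)
        sym = m.group(3)
        if (addr >> 48) != KERNEL_TOP16:
            bad.append((num, addr, sym))
    return bad


# Hypothetical bt output built from the two frames quoted in this thread.
sample = """\
#0  0xffff0000004b5644 in doadump (textdump=0) at /usr/home/tuexen/freebsd-src/sys/kern/kern_shutdown.c:404
#1  0x3fdb0000000e99f0 in ?? ()
"""

for num, addr, sym in suspicious_frames(sample):
    print(f"frame #{num}: {addr:#x} ({sym}) is outside the assumed kernel range")
```

Lines that gdb prints without a leading address (for example
"<signal handler called>" or inlined frames) are simply skipped, and
mail quoting prefixes such as ">>> " need to be removed before pasting
text in.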


My doadump line also has ", textdump@entry=212431136":

0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404

But I'm not aware of the PkgBase build configuration information
being published for making comparisons with.


>>> (kgdb)  Initial frame selected; you cannot go up.
>>> (kgdb)  Initial frame selected; you cannot go up.
>>> (kgdb) quit
>>> tuexen@head:~ % pkg info gdb
>>> gdb-15.1
>>> Name           : gdb
>>> Version        : 15.1
>>> Installed on   : Thu Oct 24 10:32:17 2024 CEST
>>> Origin         : devel/gdb
>>> Architecture   : FreeBSD:15:aarch64
>>> Prefix         : /usr/local
>>> Categories     : devel
>>> Licenses       : GPLv3
>>> Maintainer     : pizzamig@FreeBSD.org
>>> WWW            : https://www.gnu.org/software/gdb/
>>> Comment        : GNU Project Debugger
>>> Options        :
>>> BUNDLED_READLINE: off
>>> BUNDLED_ZLIB   : off
>>> DEBUGINFOD     : off
>>> GDB_LINK       : on
>>> GUILE          : off
>>> KGDB           : on
>>> NLS            : on
>>> PORT_ICONV     : on
>>> PORT_READLINE  : on
>>> PYTHON         : on
>>> SOURCE_HIGHLIGHT: on
>>> SYSTEM_ICONV   : off
>>> SYSTEM_ZLIB    : on
>>> TUI            : on
>>> XXHASH         : on
>>> Shared Libs required:
>>> libzstd.so.1
>>> libxxhash.so.0
>>> libsource-highlight.so.4
>>> libreadline.so.8
>>> libpython3.11.so.1.0
>>> libmpfr.so.6
>>> libintl.so.8
>>> libiconv.so.2
>>> libgmp.so.10
>>> libexpat.so.1
>>> libboost_regex.so.1.85.0
>>> Annotations    :
>>> FreeBSD_version: 1500025
>>> build_timestamp: 2024-10-16T20:27:27+0000
>>> built_by       : poudriere-git-3.4.2
>>> cpe            : cpe:2.3:a:gnu:gdb:15.1:::::freebsd15:aarch64
>>> flavor         : py311
>>> port_checkout_unclean: no
>>> port_git_hash  : 82beca9e630
>>> ports_top_checkout_unclean: no
>>> ports_top_git_hash: 94c4ac6b071
>>> repo_type      : binary
>>> repository     : FreeBSD
>>> Flat size      : 58.5MiB
>>> Description    :
>>> GDB is a source-level debugger for Ada, C, C++, Objective-C, Pascal and
>>> many other languages.  GDB can target (i.e., debug programs running on)
>>> more than a dozen different processor architectures, and GDB itself can
>>> run on most popular GNU/Linux, Unix and Microsoft Windows variants.

This is the same gdb package version as installed in my context.
kgdb, by itself, should not be the source of the behavior
differences.

>>> tuexen@head:~ % 
>>> 
>>> Using kgdb from "pkg install gdb" and locally built world and kernel.
>>> 
>>> Best regards
>>> Michael
>>>> 
>>>> For reference:
>>>> 
>>>> . . . (deletion) . . .
> 

I have not identified anything else to investigate.


===
Mark Millard
marklmi at yahoo.com