Re: kernel core debugging

From: Bakul Shah <bakul_at_iitbombay.org>
Date: Wed, 06 Nov 2024 19:27:43 UTC
My guess: either the core and the kernel don't match or the core is corrupted.
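
If the mismatch theory is worth testing, one rough check is to compare the version string recorded with the dump against the version of the kernel handed to kgdb. Below is a minimal Python sketch. It assumes savecore wrote an info file next to the vmcore (for example /var/crash/info.last) and that the file contains a "Version String:" line. The function names and the sample text are made up for illustration, and only the first line of the version is compared.

```python
import re
from typing import Optional


def dump_version(info_text: str) -> Optional[str]:
    """Return the first line of the 'Version String:' field, or None if absent."""
    match = re.search(r"^\s*Version String:\s*(.+)$", info_text, re.MULTILINE)
    return match.group(1).strip() if match else None


def same_kernel(info_text: str, kernel_version: str) -> bool:
    """True if the dump's recorded version matches the kernel's first version line."""
    recorded = dump_version(info_text)
    if recorded is None or not kernel_version.strip():
        return False
    return recorded == kernel_version.strip().splitlines()[0].strip()


# Hypothetical sample text; a real info file has more fields and real values.
SAMPLE_INFO = """\
Dump header from device: /dev/example0
  Architecture: aarch64
  Version String: FreeBSD 15.0-CURRENT #1 example-build
  Panic String: tcp_do_segment: sent too much
"""

if __name__ == "__main__":
    print(dump_version(SAMPLE_INFO))
    print(same_kernel(SAMPLE_INFO, "FreeBSD 15.0-CURRENT #1 example-build"))
```

On the affected machine, info_text would come from reading the info file and kernel_version from something like the output of uname -v while running the kernel in question. Both are assumptions about the setup, not something shown in this thread.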

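As for the odd frame #1 address in the quoted session, a quick way to see how it differs from frame #0 is to split each value into its upper 16 bits and lower 48 bits. This is only an illustrative sketch in Python. The helper names are made up, and the "kernel-like" test merely encodes the observation that every resolved kernel address quoted in this thread starts with 0xffff. It is not a statement about the architecture's address map.

```python
# Upper 16 bits shared by the resolved kernel addresses quoted in this thread.
KERNEL_TOP16 = 0xFFFF


def top16(addr: int) -> int:
    """Upper 16 bits of a 64-bit address."""
    return (addr >> 48) & 0xFFFF


def low48(addr: int) -> int:
    """Lower 48 bits of a 64-bit address."""
    return addr & ((1 << 48) - 1)


def looks_like_kernel_address(addr: int) -> bool:
    """Heuristic based only on the addresses seen in this thread."""
    return top16(addr) == KERNEL_TOP16


# Values copied from the quoted kgdb session.
FRAME0 = 0xFFFF0000004B5644  # doadump
FRAME1 = 0x3FDB0000000E99F0  # the unresolved "?? ()" frame

if __name__ == "__main__":
    for name, addr in (("frame0", FRAME0), ("frame1", FRAME1)):
        print(
            f"{name}: top16={top16(addr):#06x} "
            f"low48={low48(addr):#014x} "
            f"kernel-like={looks_like_kernel_address(addr)}"
        )
```

This only shows where the two values differ. Whether the difference comes from a mismatched kernel, a damaged dump, or something else is not something the snippet can tell.
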
> On Nov 6, 2024, at 11:19 AM, Mark Millard <marklmi@yahoo.com> wrote:
> 
> On Nov 6, 2024, at 10:07, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
>> 
>>> On 6. Nov 2024, at 17:51, Mark Millard <marklmi@yahoo.com> wrote:
>>> 
>>> 
>>> 
>>> On Nov 6, 2024, at 09:28, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
>>> 
>>>>> On 6. Nov 2024, at 15:12, Mark Millard <marklmi@yahoo.com> wrote:
>>>>> 
>>>>> On Nov 6, 2024, at 01:44, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
>>>>> 
>>>>>> is debugging a kernel panic by using kgdb or lldb on a core file
>>>>>> supposed to work? At least it is not right now for me...
>>>>> 
>>>>> # kgdb /boot/kernel.GENERIC-NODEBUG/kernel /var/crash/vmcore.2
>>>>> GNU gdb (GDB) 15.1 [GDB v15.1 for FreeBSD]
>>>>> Copyright (C) 2024 Free Software Foundation, Inc.
>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>> This is free software: you are free to change and redistribute it.
>>>>> There is NO WARRANTY, to the extent permitted by law.
>>>>> Type "show copying" and "show warranty" for details.
>>>>> This GDB was configured as "aarch64-portbld-freebsd15.0".
>>>>> Type "show configuration" for configuration details.
>>>>> For bug reporting instructions, please see:
>>>>> <https://www.gnu.org/software/gdb/bugs/>.
>>>>> Find the GDB manual and other documentation resources online at:
>>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>>> 
>>>>> For help, type "help".
>>>>> Type "apropos word" to search for commands related to "word"...
>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/kernel...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/kernel.debug...
>>>>> 
>>>>> Unread portion of the kernel message buffer:
>>>>> KDB: enter: manual escape to debugger
>>>>> 
>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/uhid.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/uhid.ko.debug...
>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/wmt.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/wmt.ko.debug...
>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/ums.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/ums.ko.debug...
>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/zfs.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/zfs.ko.debug...
>>>>> 0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404
>>>>> warning: 404 /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c: No such file or directory
>>>>> (kgdb) bt
>>>>> #0  0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404
>>>>> #1  0xffff0000000ee6a8 in db_dump (dummy=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>) at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:596
>>>>> #2  0xffff0000000ee478 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=true) at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:508
>>>>> #3  0xffff0000000ee150 in db_command_loop () at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:555
>>>>> #4  0xffff0000000f1ff4 in db_trap (type=<optimized out>, code=<optimized out>) at /home/pkgbuild/worktrees/main/sys/ddb/db_main.c:267
>>>>> #5  0xffff000000568b0c in kdb_trap (type=60, code=0, tf=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/subr_kdb.c:790
>>>>> #6  <signal handler called>
>>>>> #7  kdb_enter (why=<optimized out>, msg=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/subr_kdb.c:556
>>>>> #8  0xffff0000003625cc in vt_machine_kbdevent (vd=<optimized out>, c=<optimized out>) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:761
>>>>> #9  vt_processkey (kbd=0xffffa000803caa80, vd=0xffff000000d24360 <vt_consdev>, c=-2147483514) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:903
>>>>> #10 vt_kbdevent (kbd=0xffffa000803caa80, event=<optimized out>, arg=0xffff000000d24360 <vt_consdev>) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:1030
>>>>> #11 0xffff0000001ea048 in kbdmux_intr (kbd=0xffffa000803caa80, arg=<optimized out>) at /home/pkgbuild/worktrees/main/sys/dev/kbdmux/kbdmux.c:565
>>>>> #12 0xffff0000005839ac in taskqueue_run_locked (queue=queue@entry=0xffffa000803c9c00) at /home/pkgbuild/worktrees/main/sys/kern/subr_taskqueue.c:517
>>>>> #13 0xffff000000583714 in taskqueue_run (queue=0xffffa000803c9c00) at /home/pkgbuild/worktrees/main/sys/kern/subr_taskqueue.c:532
>>>>> #14 0xffff0000004bc114 in intr_event_execute_handlers (ie=0xffffa0008028ec00, p=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1183
>>>>> #15 ithread_execute_handlers (ie=0xffffa0008028ec00, p=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1196
>>>>> #16 ithread_loop (arg=<optimized out>, arg@entry=0xffffa000803de5a0) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1289
>>>>> #17 0xffff0000004b700c in fork_exit (callout=0xffff0000004bbd78 <ithread_loop>, arg=0xffffa000803de5a0, frame=0xffff00010ca97a00) at /home/pkgbuild/worktrees/main/sys/kern/kern_fork.c:1151
>>>>> #18 <signal handler called>
>>>>> 
>>>>> The context here was from an official PkgBase kernel and world
>>>>> installation.
>>>>> 
>>>>> . . . (deletion) . . .
>>>>> 
>>>>> You may have to be more explicit about the specifics of the
>>>>> problem(s) you are seeing.
>>>> OK. Here is what I am referring to:
>>>> 
>>>> tuexen@head:~ % sudo kgdb -c /var/crash/vmcore.last /boot/kernel/kernel
>>> 
>>> That command does not match the parameter order in the man page or
>>> in my example.
>>> 
>>> man kgdb output shows kernel first, then core:
>>> 
>>> SYNOPSIS
>>>  kgdb [-a | -f | -fullname] [-b rate] [-q | -quiet] [-v] [-w]
>>>       [-d crashdir] [-c core | -n dumpnr | -r device] [kernel [core]]
>>> 
>>> My example: # kgdb /boot/kernel.GENERIC-NODEBUG/kernel /var/crash/vmcore.2
>>> 
>>> You might want to see if using the other order makes a difference.
>> No it doesn't. I'm specifying the core via the -c core option instead of 
>> the second argument...
> 
> Of course. Sorry for the noise. (Does not look like this is
> going to be one of my better mornings.)
> 
> I guess about all we learn is that your issue is somehow more
> specific to your context rather than it being an example of
> kgdb being generally broken.
> 
>> Best regards
>> Michael
>>> 
>>>> Password:
>>>> GNU gdb (GDB) 15.1 [GDB v15.1 for FreeBSD]
>>>> Copyright (C) 2024 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law.
>>>> Type "show copying" and "show warranty" for details.
>>>> This GDB was configured as "aarch64-portbld-freebsd15.0".
>>>> Type "show configuration" for configuration details.
>>>> For bug reporting instructions, please see:
>>>> <https://www.gnu.org/software/gdb/bugs/>.
>>>> Find the GDB manual and other documentation resources online at:
>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>> 
>>>> For help, type "help".
>>>> Type "apropos word" to search for commands related to "word"...
>>>> Reading symbols from /boot/kernel/kernel...
>>>> Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...
>>>> 
>>>> Unread portion of the kernel message buffer:
>>>> panic: tcp_do_segment: sent too much
>>>> cpuid = 1
>>>> time = 1730910226
>>>> KDB: stack backtrace:
>>>> db_trace_self() at db_trace_self
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x38
>>>> vpanic() at vpanic+0x1a0
>>>> panic() at panic+0x48
>>>> tcp_do_segment() at tcp_do_segment+0x2794
>>>> tcp_input_with_port() at tcp_input_with_port+0xcbc
>>>> tcp_input() at tcp_input+0x10
>>>> ip_input() at ip_input+0x35c
>>>> netisr_dispatch_src() at netisr_dispatch_src+0xd8
>>>> tunwrite() at tunwrite+0x2a8
>>>> devfs_write_f() at devfs_write_f+0x108
>>>> dofilewrite() at dofilewrite+0x7c
>>>> kern_writev() at kern_writev+0x4c
>>>> sys_writev() at sys_writev+0x40
>>>> do_el0_sync() at do_el0_sync+0x60c
>>>> handle_el0_sync() at handle_el0_sync+0x4c
>>>> --- exception, esr 0x56000000
>>>> KDB: enter: panic
>>>> 
>>>> Reading symbols from /boot/kernel/tcp_rack.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel/tcp_rack.ko.debug...
>>>> Reading symbols from /boot/kernel/tcphpts.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel/tcphpts.ko.debug...
>>>> Reading symbols from /boot/kernel/if_bridge.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel/if_bridge.ko.debug...
>>>> Reading symbols from /boot/kernel/bridgestp.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel/bridgestp.ko.debug...
>>>> Reading symbols from /boot/kernel/uhid.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel/uhid.ko.debug...
>>>> Reading symbols from /boot/kernel/wmt.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel/wmt.ko.debug...
>>>> Reading symbols from /boot/kernel/cc_newreno.ko...
>>>> Reading symbols from /usr/lib/debug//boot/kernel/cc_newreno.ko.debug...
>>>> 0xffff0000004b5644 in doadump (textdump=0) at /usr/home/tuexen/freebsd-src/sys/kern/kern_shutdown.c:404
>>>> 404 dump_savectx();
>>>> (kgdb) up
>>>> #1  0x3fdb0000000e99f0 in ?? ()
> 
> Somehow it went from referencing the apparently correct/expected
> 0xffff0000004b5644 (doadump) to referencing 0x3fdb0000000e99f0 .
> There is also the odd "404 dump_savectx();".
> 
> If bt is used as the first command at the prompt, what does it
> show for the backtrace? Anything interesting? Just #0 for
> 0xffff0000004b5644 and #1 for 0x3fdb0000000e99f0 ?
> 
> 
> My doadump line also has ", textdump@entry=212431136":
> 
> 0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404
> 
> But I'm not aware of the PkgBase build configuration information
> being published for making comparisons with.
> 
> 
>>>> (kgdb)  Initial frame selected; you cannot go up.
>>>> (kgdb)  Initial frame selected; you cannot go up.
>>>> (kgdb) quit
>>>> tuexen@head:~ % pkg info gdb
>>>> gdb-15.1
>>>> Name           : gdb
>>>> Version        : 15.1
>>>> Installed on   : Thu Oct 24 10:32:17 2024 CEST
>>>> Origin         : devel/gdb
>>>> Architecture   : FreeBSD:15:aarch64
>>>> Prefix         : /usr/local
>>>> Categories     : devel
>>>> Licenses       : GPLv3
>>>> Maintainer     : pizzamig@FreeBSD.org
>>>> WWW            : https://www.gnu.org/software/gdb/
>>>> Comment        : GNU Project Debugger
>>>> Options        :
>>>> BUNDLED_READLINE: off
>>>> BUNDLED_ZLIB   : off
>>>> DEBUGINFOD     : off
>>>> GDB_LINK       : on
>>>> GUILE          : off
>>>> KGDB           : on
>>>> NLS            : on
>>>> PORT_ICONV     : on
>>>> PORT_READLINE  : on
>>>> PYTHON         : on
>>>> SOURCE_HIGHLIGHT: on
>>>> SYSTEM_ICONV   : off
>>>> SYSTEM_ZLIB    : on
>>>> TUI            : on
>>>> XXHASH         : on
>>>> Shared Libs required:
>>>> libzstd.so.1
>>>> libxxhash.so.0
>>>> libsource-highlight.so.4
>>>> libreadline.so.8
>>>> libpython3.11.so.1.0
>>>> libmpfr.so.6
>>>> libintl.so.8
>>>> libiconv.so.2
>>>> libgmp.so.10
>>>> libexpat.so.1
>>>> libboost_regex.so.1.85.0
>>>> Annotations    :
>>>> FreeBSD_version: 1500025
>>>> build_timestamp: 2024-10-16T20:27:27+0000
>>>> built_by       : poudriere-git-3.4.2
>>>> cpe            : cpe:2.3:a:gnu:gdb:15.1:::::freebsd15:aarch64
>>>> flavor         : py311
>>>> port_checkout_unclean: no
>>>> port_git_hash  : 82beca9e630
>>>> ports_top_checkout_unclean: no
>>>> ports_top_git_hash: 94c4ac6b071
>>>> repo_type      : binary
>>>> repository     : FreeBSD
>>>> Flat size      : 58.5MiB
>>>> Description    :
>>>> GDB is a source-level debugger for Ada, C, C++, Objective-C, Pascal and
>>>> many other languages.  GDB can target (i.e., debug programs running on)
>>>> more than a dozen different processor architectures, and GDB itself can
>>>> run on most popular GNU/Linux, Unix and Microsoft Windows variants.
> 
> Same gdb package version installation as in my context.
> kgdb, of itself, should not be a source of the behavior
> differences.
> 
>>>> tuexen@head:~ % 
>>>> 
>>>> Using kgdb from "pkg install gdb" and locally built world and kernel.
>>>> 
>>>> Best regards
>>>> Michael
>>>>> 
>>>>> For reference:
>>>>> 
>>>>> . . . (deletion) . . .
>> 
> 
> I'm not identifying anything else to investigate.
> 
> 
> ===
> Mark Millard
> marklmi at yahoo.com
> 
>