Re: kernel core debugging

From: <tuexen_at_freebsd.org>
Date: Wed, 06 Nov 2024 22:24:15 UTC
> On 6. Nov 2024, at 19:27, Bakul Shah <bakul@iitbombay.org> wrote:
> 
> My guess: either the core and the kernel don't match or the core is corrupted.
I'm running a test script resulting in the panic. Doing the same on an
amd64 box results in kgdb being usable to debug the problem. Just want
to use the arm64 box for it...
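
For what it's worth, a rough way to rule out the first guess (core and
kernel not matching) might be to compare the version string recorded with
the dump against the one embedded in the kernel binary. This is only a
sketch under assumptions: the function names are made up for illustration,
it assumes the info file written by savecore contains a "Version String:"
line, and it assumes the kernel binary embeds the same banner as a plain
string starting with "FreeBSD <number>".

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of any tool).

# Print the first line of the version string from a savecore info file.
# Assumes a line of the form "  Version String: FreeBSD ...".
info_version() {
    sed -n 's/^ *Version String: *//p' "$1" | head -n 1
}

# Print the first printable run in the kernel binary that looks like a
# version banner. Non-printable bytes are turned into newlines first,
# which is a crude stand-in for strings(1).
kernel_version() {
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" | sed -n '/^FreeBSD [0-9]/{p;q;}'
}

# Succeed only if both strings were found and are identical.
same_build() {
    iv=$(info_version "$1")
    kv=$(kernel_version "$2")
    [ -n "$iv" ] && [ "$iv" = "$kv" ]
}

# Possible usage (paths as used in this thread):
#   same_build /var/crash/info.last /boot/kernel/kernel && echo match
```

A match would not prove the core is intact, it would only make the
mismatch explanation less likely.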

Best regards
Michael
> 
>> On Nov 6, 2024, at 11:19 AM, Mark Millard <marklmi@yahoo.com> wrote:
>> 
>> On Nov 6, 2024, at 10:07, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
>>> 
>>>> On 6. Nov 2024, at 17:51, Mark Millard <marklmi@yahoo.com> wrote:
>>>> 
>>>> 
>>>> 
>>>> On Nov 6, 2024, at 09:28, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
>>>> 
>>>>>> On 6. Nov 2024, at 15:12, Mark Millard <marklmi@yahoo.com> wrote:
>>>>>> 
>>>>>> On Nov 6, 2024, at 01:44, tuexen@freebsd.org <tuexen@FreeBSD.org> wrote:
>>>>>> 
>>>>>>> is debugging a kernel panic by using kgdb or lldb on a core file
>>>>>>> supposed to work? At least it is not right now for me...
>>>>>> 
>>>>>> # kgdb /boot/kernel.GENERIC-NODEBUG/kernel /var/crash/vmcore.2
>>>>>> GNU gdb (GDB) 15.1 [GDB v15.1 for FreeBSD]
>>>>>> Copyright (C) 2024 Free Software Foundation, Inc.
>>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>>> This is free software: you are free to change and redistribute it.
>>>>>> There is NO WARRANTY, to the extent permitted by law.
>>>>>> Type "show copying" and "show warranty" for details.
>>>>>> This GDB was configured as "aarch64-portbld-freebsd15.0".
>>>>>> Type "show configuration" for configuration details.
>>>>>> For bug reporting instructions, please see:
>>>>>> <https://www.gnu.org/software/gdb/bugs/>.
>>>>>> Find the GDB manual and other documentation resources online at:
>>>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>>>> 
>>>>>> For help, type "help".
>>>>>> Type "apropos word" to search for commands related to "word"...
>>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/kernel...
>>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/kernel.debug...
>>>>>> 
>>>>>> Unread portion of the kernel message buffer:
>>>>>> KDB: enter: manual escape to debugger
>>>>>> 
>>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/uhid.ko...
>>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/uhid.ko.debug...
>>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/wmt.ko...
>>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/wmt.ko.debug...
>>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/ums.ko...
>>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/ums.ko.debug...
>>>>>> Reading symbols from /boot/kernel.GENERIC-NODEBUG/zfs.ko...
>>>>>> Reading symbols from /usr/lib/debug//boot/kernel.GENERIC-NODEBUG/zfs.ko.debug...
>>>>>> 0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404
>>>>>> warning: 404 /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c: No such file or directory
>>>>>> (kgdb) bt
>>>>>> #0  0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404
>>>>>> #1  0xffff0000000ee6a8 in db_dump (dummy=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>) at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:596
>>>>>> #2  0xffff0000000ee478 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=true) at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:508
>>>>>> #3  0xffff0000000ee150 in db_command_loop () at /home/pkgbuild/worktrees/main/sys/ddb/db_command.c:555
>>>>>> #4  0xffff0000000f1ff4 in db_trap (type=<optimized out>, code=<optimized out>) at /home/pkgbuild/worktrees/main/sys/ddb/db_main.c:267
>>>>>> #5  0xffff000000568b0c in kdb_trap (type=60, code=0, tf=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/subr_kdb.c:790
>>>>>> #6  <signal handler called>
>>>>>> #7  kdb_enter (why=<optimized out>, msg=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/subr_kdb.c:556
>>>>>> #8  0xffff0000003625cc in vt_machine_kbdevent (vd=<optimized out>, c=<optimized out>) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:761
>>>>>> #9  vt_processkey (kbd=0xffffa000803caa80, vd=0xffff000000d24360 <vt_consdev>, c=-2147483514) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:903
>>>>>> #10 vt_kbdevent (kbd=0xffffa000803caa80, event=<optimized out>, arg=0xffff000000d24360 <vt_consdev>) at /home/pkgbuild/worktrees/main/sys/dev/vt/vt_core.c:1030
>>>>>> #11 0xffff0000001ea048 in kbdmux_intr (kbd=0xffffa000803caa80, arg=<optimized out>) at /home/pkgbuild/worktrees/main/sys/dev/kbdmux/kbdmux.c:565
>>>>>> #12 0xffff0000005839ac in taskqueue_run_locked (queue=queue@entry=0xffffa000803c9c00) at /home/pkgbuild/worktrees/main/sys/kern/subr_taskqueue.c:517
>>>>>> #13 0xffff000000583714 in taskqueue_run (queue=0xffffa000803c9c00) at /home/pkgbuild/worktrees/main/sys/kern/subr_taskqueue.c:532
>>>>>> #14 0xffff0000004bc114 in intr_event_execute_handlers (ie=0xffffa0008028ec00, p=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1183
>>>>>> #15 ithread_execute_handlers (ie=0xffffa0008028ec00, p=<optimized out>) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1196
>>>>>> #16 ithread_loop (arg=<optimized out>, arg@entry=0xffffa000803de5a0) at /home/pkgbuild/worktrees/main/sys/kern/kern_intr.c:1289
>>>>>> #17 0xffff0000004b700c in fork_exit (callout=0xffff0000004bbd78 <ithread_loop>, arg=0xffffa000803de5a0, frame=0xffff00010ca97a00) at /home/pkgbuild/worktrees/main/sys/kern/kern_fork.c:1151
>>>>>> #18 <signal handler called>
>>>>>> 
>>>>>> The context here was from an official PkgBase kernel and world
>>>>>> installation.
>>>>>> 
>>>>>> . . . (deletion) . . .
>>>>>> 
>>>>>> You may have to be more explicit about the specifics of the
>>>>>> problem(s) you are seeing.
>>>>> OK. Here is what I am referring to:
>>>>> 
>>>>> tuexen@head:~ % sudo kgdb -c /var/crash/vmcore.last /boot/kernel/kernel
>>>> 
>>>> That command does not match the parameter order in the man page or
>>>> in my example.
>>>> 
>>>> man kgdb output shows kernel first, then core:
>>>> 
>>>> SYNOPSIS
>>>> kgdb [-a | -f | -fullname] [-b rate] [-q | -quiet] [-v] [-w]
>>>>     [-d crashdir] [-c core | -n dumpnr | -r device] [kernel [core]]
>>>> 
>>>> My example: # kgdb /boot/kernel.GENERIC-NODEBUG/kernel /var/crash/vmcore.2
>>>> 
>>>> You might want to see if using the other order makes a difference.
>>> No it doesn't. I'm specifying the core via the -c core option instead of 
>>> the second argument...
>> 
>> Of course. Sorry for the noise. (Does not look like this is
>> going to be one of my better mornings.)
>> 
>> I guess about all we learn is that your issue is somehow more
>> specific to your context rather than it being an example of
>> kgdb being generally broken.
>> 
>>> Best regards
>>> Michael
>>>> 
>>>>> Password:
>>>>> GNU gdb (GDB) 15.1 [GDB v15.1 for FreeBSD]
>>>>> Copyright (C) 2024 Free Software Foundation, Inc.
>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>> This is free software: you are free to change and redistribute it.
>>>>> There is NO WARRANTY, to the extent permitted by law.
>>>>> Type "show copying" and "show warranty" for details.
>>>>> This GDB was configured as "aarch64-portbld-freebsd15.0".
>>>>> Type "show configuration" for configuration details.
>>>>> For bug reporting instructions, please see:
>>>>> <https://www.gnu.org/software/gdb/bugs/>.
>>>>> Find the GDB manual and other documentation resources online at:
>>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>>> 
>>>>> For help, type "help".
>>>>> Type "apropos word" to search for commands related to "word"...
>>>>> Reading symbols from /boot/kernel/kernel...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...
>>>>> 
>>>>> Unread portion of the kernel message buffer:
>>>>> panic: tcp_do_segment: sent too much
>>>>> cpuid = 1
>>>>> time = 1730910226
>>>>> KDB: stack backtrace:
>>>>> db_trace_self() at db_trace_self
>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x38
>>>>> vpanic() at vpanic+0x1a0
>>>>> panic() at panic+0x48
>>>>> tcp_do_segment() at tcp_do_segment+0x2794
>>>>> tcp_input_with_port() at tcp_input_with_port+0xcbc
>>>>> tcp_input() at tcp_input+0x10
>>>>> ip_input() at ip_input+0x35c
>>>>> netisr_dispatch_src() at netisr_dispatch_src+0xd8
>>>>> tunwrite() at tunwrite+0x2a8
>>>>> devfs_write_f() at devfs_write_f+0x108
>>>>> dofilewrite() at dofilewrite+0x7c
>>>>> kern_writev() at kern_writev+0x4c
>>>>> sys_writev() at sys_writev+0x40
>>>>> do_el0_sync() at do_el0_sync+0x60c
>>>>> handle_el0_sync() at handle_el0_sync+0x4c
>>>>> --- exception, esr 0x56000000
>>>>> KDB: enter: panic
>>>>> 
>>>>> Reading symbols from /boot/kernel/tcp_rack.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel/tcp_rack.ko.debug...
>>>>> Reading symbols from /boot/kernel/tcphpts.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel/tcphpts.ko.debug...
>>>>> Reading symbols from /boot/kernel/if_bridge.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel/if_bridge.ko.debug...
>>>>> Reading symbols from /boot/kernel/bridgestp.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel/bridgestp.ko.debug...
>>>>> Reading symbols from /boot/kernel/uhid.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel/uhid.ko.debug...
>>>>> Reading symbols from /boot/kernel/wmt.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel/wmt.ko.debug...
>>>>> Reading symbols from /boot/kernel/cc_newreno.ko...
>>>>> Reading symbols from /usr/lib/debug//boot/kernel/cc_newreno.ko.debug...
>>>>> 0xffff0000004b5644 in doadump (textdump=0) at /usr/home/tuexen/freebsd-src/sys/kern/kern_shutdown.c:404
>>>>> 404 dump_savectx();
>>>>> (kgdb) up
>>>>> #1  0x3fdb0000000e99f0 in ?? ()
>> 
>> Somehow it went from referencing the apparently correct/expected
>> 0xffff0000004b5644 (doadump) to referencing 0x3fdb0000000e99f0 .
>> There is also the odd "404 dump_savectx();".
>> 
>> If bt is used as the first command at the prompt, what does it
>> show for the backtrace? Anything interesting? Just #0 for
>> 0xffff0000004b5644 and #1 for 0x3fdb0000000e99f0 ?
>> 
>> 
>> My doadump line also has ", textdump@entry=212431136":
>> 
>> 0xffff00000050f3f0 in doadump (textdump=0, textdump@entry=212431136) at /home/pkgbuild/worktrees/main/sys/kern/kern_shutdown.c:404
>> 
>> But I'm not aware of the PkgBase build configuration information
>> being published for making comparisons with.
>> 
>> 
>>>>> (kgdb)  Initial frame selected; you cannot go up.
>>>>> (kgdb)  Initial frame selected; you cannot go up.
>>>>> (kgdb) quit
>>>>> tuexen@head:~ % pkg info gdb
>>>>> gdb-15.1
>>>>> Name           : gdb
>>>>> Version        : 15.1
>>>>> Installed on   : Thu Oct 24 10:32:17 2024 CEST
>>>>> Origin         : devel/gdb
>>>>> Architecture   : FreeBSD:15:aarch64
>>>>> Prefix         : /usr/local
>>>>> Categories     : devel
>>>>> Licenses       : GPLv3
>>>>> Maintainer     : pizzamig@FreeBSD.org
>>>>> WWW            : https://www.gnu.org/software/gdb/
>>>>> Comment        : GNU Project Debugger
>>>>> Options        :
>>>>> BUNDLED_READLINE: off
>>>>> BUNDLED_ZLIB   : off
>>>>> DEBUGINFOD     : off
>>>>> GDB_LINK       : on
>>>>> GUILE          : off
>>>>> KGDB           : on
>>>>> NLS            : on
>>>>> PORT_ICONV     : on
>>>>> PORT_READLINE  : on
>>>>> PYTHON         : on
>>>>> SOURCE_HIGHLIGHT: on
>>>>> SYSTEM_ICONV   : off
>>>>> SYSTEM_ZLIB    : on
>>>>> TUI            : on
>>>>> XXHASH         : on
>>>>> Shared Libs required:
>>>>> libzstd.so.1
>>>>> libxxhash.so.0
>>>>> libsource-highlight.so.4
>>>>> libreadline.so.8
>>>>> libpython3.11.so.1.0
>>>>> libmpfr.so.6
>>>>> libintl.so.8
>>>>> libiconv.so.2
>>>>> libgmp.so.10
>>>>> libexpat.so.1
>>>>> libboost_regex.so.1.85.0
>>>>> Annotations    :
>>>>> FreeBSD_version: 1500025
>>>>> build_timestamp: 2024-10-16T20:27:27+0000
>>>>> built_by       : poudriere-git-3.4.2
>>>>> cpe            : cpe:2.3:a:gnu:gdb:15.1:::::freebsd15:aarch64
>>>>> flavor         : py311
>>>>> port_checkout_unclean: no
>>>>> port_git_hash  : 82beca9e630
>>>>> ports_top_checkout_unclean: no
>>>>> ports_top_git_hash: 94c4ac6b071
>>>>> repo_type      : binary
>>>>> repository     : FreeBSD
>>>>> Flat size      : 58.5MiB
>>>>> Description    :
>>>>> GDB is a source-level debugger for Ada, C, C++, Objective-C, Pascal and
>>>>> many other languages.  GDB can target (i.e., debug programs running on)
>>>>> more than a dozen different processor architectures, and GDB itself can
>>>>> run on most popular GNU/Linux, Unix and Microsoft Windows variants.
>> 
>> Same gdb package version installation as in my context.
>> kgdb, of itself, should not be a source of the behavior
>> differences.
>> 
>>>>> tuexen@head:~ % 
>>>>> 
>>>>> Using kgdb from "pkg install gdb" and locally built world and kernel.
>>>>> 
>>>>> Best regards
>>>>> Michael
>>>>>> 
>>>>>> For reference:
>>>>>> 
>>>>>> . . . (deletion) . . .
>>> 
>> 
>> I'm not identifying anything else to investigate.
>> 
>> 
>> ===
>> Mark Millard
>> marklmi at yahoo.com
>> 
>> 
>