Re: Confused about the kernel stack backtrace
- Reply: Mark Johnston : "Re: Confused about the kernel stack backtrace"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Sun, 26 Feb 2023 06:22:08 UTC
> On Feb 24, 2023, at 11:34 PM, Mark Johnston <markj@freebsd.org> wrote: > > On Fri, Feb 24, 2023 at 09:26:28PM +0800, Zhenlei Huang wrote: >> Hi, >> >> The job FreeBSD-main-amd64-test on ci is failing, and some kernel stack backtrace [1] >> looks weird. > > BTW, it is preferred to use public mailing lists for this kind of > question. Also CC freebsd-hackers . > >>> Memory modified after free 0xfffffe00ccc29000(8184) val=0 @ 0xfffffe00ccc29698 >>> panic: Most recently used by temp >> >>> cpuid = 0 >>> time = 1677239728 >>> KDB: stack backtrace: >>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0084e3eaa0 >>> vpanic() at vpanic+0x152/frame 0xfffffe0084e3eaf0 >>> panic() at panic+0x43/frame 0xfffffe0084e3eb50 >>> mtrash_dtor() at mtrash_dtor/frame 0xfffffe0084e3eb70 >>> item_ctor() at item_ctor+0x11f/frame 0xfffffe0084e3ebc0 >>> malloc() at malloc+0x7f/frame 0xfffffe0084e3ec00 >>> g_read_data() at g_read_data+0x82/frame 0xfffffe0084e3ec40 >>> g_use_g_read_data() at g_use_g_read_data+0x46/frame 0xfffffe0084e3ec60 >>> readsuper() at readsuper+0x29/frame 0xfffffe0084e3ecf0 >>> ffs_sbget() at ffs_sbget+0x84/frame 0xfffffe0084e3ed70 >>> g_label_ufs_taste_common() at g_label_ufs_taste_common+0x8b/frame 0xfffffe0084e3edc0 >>> g_label_taste() at g_label_taste+0x1d0/frame 0xfffffe0084e3eea0 >>> g_new_provider_event() at g_new_provider_event+0x9a/frame 0xfffffe0084e3eec0 >>> g_run_events() at g_run_events+0x104/frame 0xfffffe0084e3eef0 >>> fork_exit() at fork_exit+0x80/frame 0xfffffe0084e3ef30 >>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0084e3ef30 >>> --- trap 0, rip = 0, rsp = 0, rbp = 0 --- >>> KDB: enter: panic >> >> The source code sys/vm/uma_dbg.c shows clearly that the panic comes from `mtrash_ctor()`. >> >> Why KDB shows that the panic is from `mtrash_dtor()` ? > > I couldn't reproduce this locally (i.e., the stack trace looks correct > when the UAF is triggered), but the problem is a bit clearer after > grabbing a kernel from artifact.ci.freebsd.org <http://artifact.ci.freebsd.org/>. Maybe a hand-crafted kernel module which modify after free intensionally can reproduce this easily. > > In mtrash_ctor(), the final instruction is a call to panic(): > > (kgdb) disas mtrash_ctor > ... > 0xffffffff80f766be <+110>: mov 0x10(%rax),%rsi > 0xffffffff80f766c2 <+114>: mov $0xffffffff81200154,%rdi > 0xffffffff80f766c9 <+121>: xor %eax,%eax > 0xffffffff80f766cb <+123>: call 0xffffffff80bed350 <panic> > (kgdb) > > This works because the compiler knows that panic() never returns. > > However, the return address saved on the stack will still point to the > "next" instruction, which is now outside of the bounds of the > mtrash_ctor symbol, and it happens to be the first instruction of > mtrash_dtor(): > > (kgdb) x/2i 0xffffffff80f766cb > > 0xffffffff80f766cb <mtrash_ctor+123>: call 0xffffffff80bed350 <panic> > 0xffffffff80f766d0 <mtrash_dtor>: push %rbp > > So DDB's stack unwinder reports the call as coming from mtrash_dtor() > instead of mtrash_ctor(). Thanks for the detailed analyzation ! > > I'm not sure how to fix this. Instead of resolving the symbol > containing the return address, it could maybe resolve the symbol > containing the previous instruction, but variable-length instructions > make that tricky. I'd like to look at this issue when I have time. Best regards, Zhenlei