Re: Confused about the kernel stack backtrace
- In reply to: Zhenlei Huang : "Re: Confused about the kernel stack backtrace"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Mon, 27 Feb 2023 16:36:58 UTC
On Sun, Feb 26, 2023 at 02:22:08PM +0800, Zhenlei Huang wrote: > > On Feb 24, 2023, at 11:34 PM, Mark Johnston <markj@freebsd.org> wrote: > >>> Memory modified after free 0xfffffe00ccc29000(8184) val=0 @ 0xfffffe00ccc29698 > >>> panic: Most recently used by temp > >> > >>> cpuid = 0 > >>> time = 1677239728 > >>> KDB: stack backtrace: > >>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0084e3eaa0 > >>> vpanic() at vpanic+0x152/frame 0xfffffe0084e3eaf0 > >>> panic() at panic+0x43/frame 0xfffffe0084e3eb50 > >>> mtrash_dtor() at mtrash_dtor/frame 0xfffffe0084e3eb70 > >>> item_ctor() at item_ctor+0x11f/frame 0xfffffe0084e3ebc0 > >>> malloc() at malloc+0x7f/frame 0xfffffe0084e3ec00 > >>> g_read_data() at g_read_data+0x82/frame 0xfffffe0084e3ec40 > >>> g_use_g_read_data() at g_use_g_read_data+0x46/frame 0xfffffe0084e3ec60 > >>> readsuper() at readsuper+0x29/frame 0xfffffe0084e3ecf0 > >>> ffs_sbget() at ffs_sbget+0x84/frame 0xfffffe0084e3ed70 > >>> g_label_ufs_taste_common() at g_label_ufs_taste_common+0x8b/frame 0xfffffe0084e3edc0 > >>> g_label_taste() at g_label_taste+0x1d0/frame 0xfffffe0084e3eea0 > >>> g_new_provider_event() at g_new_provider_event+0x9a/frame 0xfffffe0084e3eec0 > >>> g_run_events() at g_run_events+0x104/frame 0xfffffe0084e3eef0 > >>> fork_exit() at fork_exit+0x80/frame 0xfffffe0084e3ef30 > >>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0084e3ef30 > >>> --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > >>> KDB: enter: panic > >> > >> The source code sys/vm/uma_dbg.c shows clearly that the panic comes from `mtrash_ctor()`. > >> > >> Why KDB shows that the panic is from `mtrash_dtor()` ? > > > > I couldn't reproduce this locally (i.e., the stack trace looks correct > > when the UAF is triggered), but the problem is a bit clearer after > > grabbing a kernel from artifact.ci.freebsd.org <http://artifact.ci.freebsd.org/>. > > Maybe a hand-crafted kernel module which modify after free intensionally can reproduce this easily. > > > > > In mtrash_ctor(), the final instruction is a call to panic(): > > > > (kgdb) disas mtrash_ctor > > ... > > 0xffffffff80f766be <+110>: mov 0x10(%rax),%rsi > > 0xffffffff80f766c2 <+114>: mov $0xffffffff81200154,%rdi > > 0xffffffff80f766c9 <+121>: xor %eax,%eax > > 0xffffffff80f766cb <+123>: call 0xffffffff80bed350 <panic> > > (kgdb) > > > > This works because the compiler knows that panic() never returns. > > > > However, the return address saved on the stack will still point to the > > "next" instruction, which is now outside of the bounds of the > > mtrash_ctor symbol, and it happens to be the first instruction of > > mtrash_dtor(): > > > > (kgdb) x/2i 0xffffffff80f766cb > > > > 0xffffffff80f766cb <mtrash_ctor+123>: call 0xffffffff80bed350 <panic> > > 0xffffffff80f766d0 <mtrash_dtor>: push %rbp > > > > So DDB's stack unwinder reports the call as coming from mtrash_dtor() > > instead of mtrash_ctor(). > > Thanks for the detailed analyzation ! > > > > > I'm not sure how to fix this. Instead of resolving the symbol > > containing the return address, it could maybe resolve the symbol > > containing the previous instruction, but variable-length instructions > > make that tricky. > > I'd like to look at this issue when I have time. It was pointed out in private mail that there's no need to use an instruction boundary when resolving a text address. In fact, db_nextframe() already knows about this problem, and I believe the patch below is sufficient to fix the stack trace. But we need some way to test it. diff --git a/sys/amd64/amd64/db_trace.c b/sys/amd64/amd64/db_trace.c index c38f4f6a4860..70aa6c3acdd1 100644 --- a/sys/amd64/amd64/db_trace.c +++ b/sys/amd64/amd64/db_trace.c @@ -200,7 +200,10 @@ db_nextframe(struct amd64_frame **fp, db_addr_t *ip, struct thread *td) return; } - db_print_stack_entry(name, rip, &(*fp)->f_frame); + /* + * See the comment above the db_search_symbol() call. + */ + db_print_stack_entry(name, rip - 1, &(*fp)->f_frame); /* * Point to base of trapframe which is just above the