[Bug 281825] SDT tracepoints are not cleaned up when a module is unloaded

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 02 Oct 2024 20:39:52 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281825

            Bug ID: 281825
           Summary: SDT tracepoints are not cleaned up when a module is
                    unloaded
           Product: Base System
           Version: Unspecified
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: jhb@FreeBSD.org

Kernel modules may reference SDT probes defined in another module (or the
kernel itself).  A specific example of this is the set of mbuf probes in
<sys/mbuf.h> for functions like m_get().  Kernel modules which use these inline
functions will include a tracepoint that gets registered during kldload in
sdt_kld_load_probes().  However, sdt_kldunload_try() doesn't clean up any of the
state initialized in sdt_kld_load_probes(), only the state initialized in
sdt_kld_load_providers().  As a result, this can leave dangling pointers (e.g.
in the tp->probe->tracepoint_list) when a module is unloaded.
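
To illustrate the kind of cleanup that is missing, here is a minimal
user-space sketch of a per-probe tracepoint list with an unload step that
unlinks a module's entries.  It is only a model: struct model_probe, struct
model_tracepoint, the integer "owner" id and all the model_* functions are
hypothetical names invented for this example, not the types or functions in
the kernel's SDT code, and it uses a hand-rolled tail queue rather than the
STAILQ macros.

```c
#include <stddef.h>

/* Hypothetical user-space model; these are not the kernel's real types. */
struct model_tracepoint {
	int owner;			/* id of the module containing this site */
	struct model_tracepoint *next;	/* link in the probe's tracepoint list */
};

struct model_probe {
	struct model_tracepoint *head;
	struct model_tracepoint **tailp;	/* address of the last next field */
};

static void
model_probe_init(struct model_probe *p)
{
	p->head = NULL;
	p->tailp = &p->head;
}

/* Load path: append one of a module's tracepoints to the probe's list. */
static void
model_tracepoint_insert(struct model_probe *p, struct model_tracepoint *tp,
    int owner)
{
	tp->owner = owner;
	tp->next = NULL;
	*p->tailp = tp;
	p->tailp = &tp->next;
}

/*
 * Unload path: unlink every tracepoint owned by the given module so the
 * list never points into memory that is about to be freed.  Returns the
 * number of entries removed.
 */
static int
model_tracepoint_remove_owner(struct model_probe *p, int owner)
{
	struct model_tracepoint **linkp = &p->head;
	int removed = 0;

	while (*linkp != NULL) {
		struct model_tracepoint *tp = *linkp;

		if (tp->owner == owner) {
			*linkp = tp->next;
			tp->next = NULL;
			removed++;
		} else {
			linkp = &tp->next;
		}
	}
	/* linkp now addresses the final NULL link, i.e. the new tail. */
	p->tailp = linkp;
	return (removed);
}

/* Count the entries currently linked on the probe's list. */
static int
model_probe_count(const struct model_probe *p)
{
	const struct model_tracepoint *tp;
	int n = 0;

	for (tp = p->head; tp != NULL; tp = tp->next)
		n++;
	return (n);
}
```

In this model, skipping model_tracepoint_remove_owner() at unload would leave
the list (and possibly the tail pointer) referring to storage owned by the
departed module, which is the shape of the dangling pointer described above.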

The panic I've seen occurs when re-loading a previously unloaded module: it
crashes in sdt_kld_load_probes() when it follows an invalid pointer in the
STAILQ_INSERT_TAIL of the tracepoint_list.  However, that panic is a bit
finicky and not easy to reproduce.  A simpler reproducer is below:

kldload sdt
kldload nvmft
kldunload nvmft
dtrace -n m-get

Panic:

Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address   = 0xffffffff8283d008
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff82816b96
stack pointer           = 0x28:0xfffffe00dc1e9730
frame pointer           = 0x28:0xfffffe00dc1e9740
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 1115 (dtrace)
rdi: 0000000000000001 rsi: ffffffff80f3a4fc rdx: 000000000000000f
rcx: 0000000080040033  r8: 0000000000000016  r9: 00000000000f4240
rax: 0000000080050033 rbx: fffffe00dc1e98e8 rbp: fffffe00dc1e9740
r10: 0000000000000000 r11: 0000000000000000 r12: 0000000000000000
r13: ffffffff82816b20 r14: ffffffff8283d000 r15: 0000000000000000
trap number             = 12
panic: page fault
cpuid = 6
time = 1727901518
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00dc1e9400
vpanic() at vpanic+0x13f/frame 0xfffffe00dc1e9530
panic() at panic+0x43/frame 0xfffffe00dc1e9590
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00dc1e95f0
trap_pfault() at trap_pfault+0xa0/frame 0xfffffe00dc1e9660
calltrap() at calltrap+0x8/frame 0xfffffe00dc1e9660
--- trap 0xc, rip = 0xffffffff82816b96, rsp = 0xfffffe00dc1e9730, rbp = 0xfffffe00dc1e9740 ---
sdt_probe_update_cb() at sdt_probe_update_cb+0x76/frame 0xfffffe00dc1e9740
smp_rendezvous_action() at smp_rendezvous_action+0x9d/frame 0xfffffe00dc1e9780
smp_rendezvous_cpus() at smp_rendezvous_cpus+0x145/frame 0xfffffe00dc1e9840
smp_rendezvous() at smp_rendezvous+0x34/frame 0xfffffe00dc1e98d0
sdt_enable() at sdt_enable+0xae/frame 0xfffffe00dc1e9910
dtrace_ecb_create_enable() at dtrace_ecb_create_enable+0xee8/frame 0xfffffe00dc1e99a0
dtrace_match() at dtrace_match+0x444/frame 0xfffffe00dc1e9a80
dtrace_enabling_match() at dtrace_enabling_match+0xc8/frame 0xfffffe00dc1e9b10
dtrace_ioctl() at dtrace_ioctl+0x178b/frame 0xfffffe00dc1e9c00
devfs_ioctl() at devfs_ioctl+0xd1/frame 0xfffffe00dc1e9c50
vn_ioctl() at vn_ioctl+0xbc/frame 0xfffffe00dc1e9cc0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00dc1e9ce0
kern_ioctl() at kern_ioctl+0x286/frame 0xfffffe00dc1e9d40
sys_ioctl() at sys_ioctl+0x12d/frame 0xfffffe00dc1e9e00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00dc1e9f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00dc1e9f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0xc9cf811a9fa, rsp = 0xc9ced0c5c28, rbp = 0xc9ced0c5c70 ---
