[Bug 259458] iflib_rxeof NULL pointer crash with vmxnet3 driver

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 01 Nov 2021 10:13:04 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=259458

--- Comment #14 from Andriy Gapon <avg@FreeBSD.org> ---
Some data from the latest crash I've got.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x15fc12000
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80a5dd07
stack pointer           = 0x28:0xfffffe00c85cb930
frame pointer           = 0x28:0xfffffe00c85cb960
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_0)
trap number             = 12
panic: page fault
cpuid = 0
time = 1635553138
KDB: stack backtrace:
 stack1 db_trace_self_wrapper+0x2b vpanic+0x182 panic+0x43 trap_fatal+0x391
trap_pfault+0x4f trap+0x286 calltrap+0x8
 stack2 bounce_bus_dmamap_sync+0x17 iflib_fl_refill+0x31b _task_fn_rx+0x84b
gtaskqueue_run_locked+0xed gtaskqueue_thread_loop+0x7e fork_exit+0x6d
fork_trampoline+0xe

(kgdb) bt
[snip]
#13 0xffffffff8073dc23 in panic (fmt=0xffffffff81178120 <cnputs_mtx+24> "") at
/usr/src/sys/kern/kern_shutdown.c:909
#14 0xffffffff809d8b31 in trap_fatal (frame=0xfffffe00c85cb870, eva=5901459456)
at /usr/src/sys/amd64/amd64/trap.c:921
#15 0xffffffff809d8b8f in trap_pfault (frame=0xfffffe00c85cb870,
usermode=<optimized out>, signo=<optimized out>, ucode=<optimized out>) at
/usr/src/sys/amd64/amd64/trap.c:739
#16 0xffffffff809d8256 in trap (frame=0xfffffe00c85cb870) at
/usr/src/sys/amd64/amd64/trap.c:405
#17 <signal handler called>
#18 0xffffffff80a5dd07 in bounce_bus_dmamap_sync (dmat=0xfffff80002d83400,
map=0x15fc12000, op=1) at /usr/src/sys/x86/x86/busdma_bounce.c:973
#19 0xffffffff8085104b in bus_dmamap_sync (dmat=0xfffff80002d83400,
map=0x15fc12000, op=<error reading variable: Cannot access memory at address
0x1>) at /usr/src/sys/x86/include/bus_dma.h:125
#20 iflib_fl_refill (ctx=0xfffff80002dd7000, fl=<optimized out>,
count=<optimized out>) at /usr/src/sys/net/iflib.c:2109
#21 0xffffffff8084d5db in iflib_fl_refill_all (ctx=0xfffff80002dd7000,
fl=0xfffff80002d955c0) at /usr/src/sys/net/iflib.c:2188
#22 iflib_rxeof (rxq=<optimized out>, budget=<optimized out>) at
/usr/src/sys/net/iflib.c:2899
#23 _task_fn_rx (context=<optimized out>) at /usr/src/sys/net/iflib.c:3868
#24 0xffffffff807808bd in gtaskqueue_run_locked (queue=0xfffff800020c7200) at
/usr/src/sys/kern/subr_gtaskqueue.c:362
#25 0xffffffff8078068e in gtaskqueue_thread_loop (arg=<optimized out>) at
/usr/src/sys/kern/subr_gtaskqueue.c:537
#26 0xffffffff8070792d in fork_exit (callout=0xffffffff80780610
<gtaskqueue_thread_loop>, arg=0xfffffe00007f8008, frame=0xfffffe00c85cbc00) at
/usr/src/sys/kern/kern_fork.c:1088


(kgdb) fr 21
#21 0xffffffff8084d5db in iflib_fl_refill_all (ctx=0xfffff80002dd7000,
fl=0xfffff80002d955c0) at /usr/src/sys/net/iflib.c:2188
2188    in /usr/src/sys/net/iflib.c
(kgdb) p fl
$1 = (iflib_fl_t) 0xfffff80002d955c0

(kgdb) fr 20
#20 iflib_fl_refill (ctx=0xfffff80002dd7000, fl=<optimized out>,
count=<optimized out>) at /usr/src/sys/net/iflib.c:2109
2109    /usr/src/sys/net/iflib.c: No such file or directory.
(kgdb) i loc
iru = {iru_paddrs = 0xfffff80002d95640, iru_idxs = 0xfffff80002d95740, iru_pidx
= 1888, iru_qsidx = 0, iru_count = 32, iru_buf_size = 4096, iru_flidx = 1
'\001'}
cb_arg = {error = 0, seg = {ds_addr = 5901459456, ds_len = 4096}, nseg = 1}
sd_m = 0xfffffe00eabdc000
sd_map = 0xfffffe00eabe8000
sd_cl = 0xfffffe00eabe0000
sd_ba = 0xfffffe00eabe4000
idx = 1949
pidx = 1920
frag_idx = -1
n = <optimized out>
i = 29
credits = 1949
bus_addr = 18446735283517988864
cl = <optimized out>
err = <optimized out>
m = <optimized out>

(kgdb) p $1.ifl_size
$4 = 2048

(kgdb) p/x *$1.ifl_rx_bitmap@32
$7 = {0xffffffffffffffff <repeats 32 times>}

(kgdb)  p *$1
$8 = {ifl_cidx = 0, ifl_pidx = 1920, ifl_credits = 1920, ifl_gen = 0 '\000',
ifl_rxd_size = 0 '\000', ifl_rx_bitmap = 0xfffff80002d83200, ifl_fragidx = 128,
ifl_size = 2048, ifl_buf_size = 4096, ifl_cltype = 3, 
  ifl_zone = 0xfffff800029c5000, ifl_sds = {ifsd_map = 0xfffffe00eabe8000,
ifsd_m = 0xfffffe00eabdc000, ifsd_cl = 0xfffffe00eabe0000, ifsd_ba =
0xfffffe00eabe4000}, ifl_rxq = 0xfffffe00ea9f5000, ifl_id = 1 '\001', 
  ifl_buf_tag = 0xfffff80002d83400, ifl_ifdi = 0xfffff80002d9b4d0,
ifl_bus_addrs = {5901619200, 5901623296, 5901533184, 5901537280, 5901541376,
5901545472, 5901549568, 5901553664, 5901557760, 5901561856, 5901565952,
5901570048, 
    5901574144, 5901488128, 5901492224, 5901496320, 5901500416, 5901504512,
5901508608, 5901512704, 5901516800, 5901520896, 5901524992, 5901529088,
5901443072, 5901447168, 5901451264, 5901455360, 5901459456, 5901602816, 
    5901606912, 5901611008}, ifl_rxd_idxs = {1920, 1921, 1922, 1923, 1924,
1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937,
1938, 1939, 1940, 1941, 1942, 1943, 1944, 1945, 1946, 1947, 2047, 1917, 
    1918, 1919}}

Things of note:
- frag_idx = -1 and ifl_rx_bitmap is indeed full (all 32 words, i.e. all 2048
bits, are set)
- i = 29 and ifl_rxd_idxs jumps from 1947 to 2047 (the maximum index, as
ifl_size = 2048) between positions 27 and 28 (counting from zero)

This makes me suspect that a concurrent refill topped up the free list while
the refill in question was running.
