rtw88 kernel panic with 14 stable and recent 13.2 stable

From: Jeremy <jeremy.m.cox_at_gmail.com>
Date: Thu, 31 Aug 2023 18:48:57 UTC
Hi all,

After attempting a clean install of the 14 stable snapshot my rtw88
wireless card stopped working and the kernel panicked. It did connect to
the wifi router properly and got an IP address, but any attempt to fetch
distfiles or do a pkg bootstrap resulted in a kernel panic. So I attempted
a clean install of a recent 13.2 stable snapshot (from August 25 2023) and
it did the same thing.

When I did a fresh install of 13.2-RELEASE, it started working again. I'm
using a desktop with a Ryzen 4600G CPU and an RTL8822CE wireless PCI card.
The kernel panics with both a custom-built kernel and the GENERIC kernel.

The kernel panic:

FreeBSD riotskates 13.2-STABLE FreeBSD 13.2-STABLE #0: Wed Aug 30 18:06:54
CDT 2023     root@riotskates:/usr/obj/usr/src/amd64.amd64/sys/VENUS  amd64

panic: vm_fault_lookup: fault on nofault entry, addr: 0xfffffe010666b000

GNU gdb (GDB) 13.1 [GDB v13.1 for FreeBSD]
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd13.2".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
panic: vm_fault_lookup: fault on nofault entry, addr: 0xfffffe010666b000
cpuid = 8
time = 1693437565
KDB: stack backtrace:
#0 0xffffffff8078181b at kdb_backtrace+0x6b
#1 0xffffffff80734c02 at vpanic+0x152
#2 0xffffffff80734aa3 at panic+0x43
#3 0xffffffff80a33b1a at vm_fault+0x12ea
#4 0xffffffff80a32751 at vm_fault_trap+0xb1
#5 0xffffffff80adee11 at trap_pfault+0x1f1
#6 0xffffffff80ab8618 at calltrap+0x8
#7 0xffffffff809954ad at linux_work_fn+0xed
#8 0xffffffff80796687 at taskqueue_run_locked+0x187
#9 0xffffffff80797983 at taskqueue_thread_loop+0xc3
#10 0xffffffff806f10e2 at fork_exit+0x82
#11 0xffffffff80ab968e at fork_trampoline+0xe
Uptime: 1m36s
Dumping 701 out of 15684
MB:..3%..12%..21%..32%..42%..51%..62%..71%..83%..92%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
55 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:396
#2  0xffffffff807347bf in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:484
#3  0xffffffff80734c6f in vpanic (fmt=<optimized out>,
    ap=ap@entry=0xfffffe01025e6b10) at /usr/src/sys/kern/kern_shutdown.c:923
#4  0xffffffff80734aa3 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:847
#5  0xffffffff80a33b1a in vm_fault_lookup (fs=0xfffffe01025e6b80)
    at /usr/src/sys/vm/vm_fault.c:842
#6  vm_fault (map=<optimized out>, vaddr=vaddr@entry=18446741879088656384,
    fault_type=2 '\002', fault_flags=fault_flags@entry=0,
    m_hold=m_hold@entry=0x0) at /usr/src/sys/vm/vm_fault.c:1477
#7  0xffffffff80a32751 in vm_fault_trap (map=<optimized out>,
    vaddr=vaddr@entry=18446741879088656392, fault_type=<optimized out>,
    fault_flags=fault_flags@entry=0, signo=0x0, ucode=0x0)
    at /usr/src/sys/vm/vm_fault.c:662
#8  0xffffffff80adee11 in trap_pfault (frame=0xfffffe01025e6d00,
    usermode=false, signo=<unavailable>, ucode=<unavailable>)
    at /usr/src/sys/amd64/amd64/trap.c:846
#9  <signal handler called>
#10 __skb_unlink (skb=0xfffffe0106e9f000, head=0xfffffe0104bfd0c0)
    at /usr/src/sys/compat/linuxkpi/common/include/linux/skbuff.h:616
#11 skb_unlink (skb=0xfffffe0106e9f000, head=0xfffffe0104bfd0c0)
    at /usr/src/sys/compat/linuxkpi/common/include/linux/skbuff.h:624
#12 rtw_c2h_work (work=<optimized out>)
    at /usr/src/sys/contrib/dev/rtw88/main.c:276
#13 0xffffffff809954ad in linux_work_fn (context=0xfffffe0104bfd0f8,
    pending=<optimized out>)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_work.c:299
#14 0xffffffff80796687 in taskqueue_run_locked (
    queue=queue@entry=0xfffff80006678600)
    at /usr/src/sys/kern/subr_taskqueue.c:514
#15 0xffffffff80797983 in taskqueue_thread_loop (arg=<optimized out>)
    at /usr/src/sys/kern/subr_taskqueue.c:826
#16 0xffffffff806f10e2 in fork_exit (
    callout=0xffffffff807978c0 <taskqueue_thread_loop>,
    arg=0xfffff80001a78640, frame=0xfffffe01025e6f40)
    at /usr/src/sys/kern/kern_fork.c:1094
#17 <signal handler called>
#18 0x4b061c04e8080000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x6213040a04e8
(kgdb)
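
kgdb prints the vaddr arguments in frames 6 and 7 in decimal, which makes
them hard to compare with the hexadecimal address in the panic message. The
following is a minimal sketch, assuming the 4 KiB amd64 base page size, that
splits a virtual address into its page base and offset. The macro and
function names are hypothetical helpers and are not taken from the kernel
sources.

```c
#include <stdint.h>

/* Assumption: 4 KiB base pages, as on amd64. */
#define ASSUMED_PAGE_SIZE 4096ULL

/* Round a virtual address down to the start of its page. */
static uint64_t
page_base(uint64_t va)
{
	return (va & ~(ASSUMED_PAGE_SIZE - 1));
}

/* Byte offset of a virtual address within its page. */
static uint64_t
page_offset(uint64_t va)
{
	return (va & (ASSUMED_PAGE_SIZE - 1));
}
```

Applied to the frame 7 value 18446741879088656392, page_base() gives
0xfffffe010666b000 (the address in the panic message, and the frame 6 vaddr)
and page_offset() gives 8, so the faulting access was 8 bytes into that page.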

The only things that jumped out at me were __skb_unlink and skb_unlink in
frames 10 and 11. Frame 12 mentions rtw_c2h_work, but the file main.c was
last modified before 13.2 release came out. The other suspect,
/usr/src/sys/compat/linuxkpi/common/include/linux/skbuff.h, has been
modified more recently. So I reverted stable/13 back to commit
1b33afb7e88b2e36db2083c488de41dbe097ef49, rebuilt and reinstalled it, and
the wireless card seems to work again. I am able to run pkg update and
fetch distfiles without a kernel panic.
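
For readers unfamiliar with what an unlink such as the one in frames 10 and
11 involves: removing a buffer from a doubly linked queue writes into both
of its neighbours, so a stale or corrupted next/prev pointer in the buffer
turns into a write to an arbitrary address. The sketch below only
illustrates that general technique. struct buf, struct queue and the
queue_* functions are hypothetical stand-ins, not the LinuxKPI sk_buff
definitions, and the sketch is not a diagnosis of this panic.

```c
#include <stddef.h>

/* Hypothetical simplified buffer; NOT the real struct sk_buff. */
struct buf {
	struct buf *next;
	struct buf *prev;
};

/* Hypothetical queue with a sentinel node forming a circular list. */
struct queue {
	struct buf head;
	size_t qlen;
};

static void
queue_init(struct queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->qlen = 0;
}

/* Append b at the tail of q. */
static void
queue_tail(struct queue *q, struct buf *b)
{
	b->next = &q->head;
	b->prev = q->head.prev;
	q->head.prev->next = b;
	q->head.prev = b;
	q->qlen++;
}

/* Remove b from q; b must currently be linked into q. */
static void
queue_unlink(struct queue *q, struct buf *b)
{
	struct buf *next = b->next;
	struct buf *prev = b->prev;

	/* Both stores dereference b's neighbours; if b->next or b->prev
	 * is stale, these are writes to memory the queue does not own. */
	next->prev = prev;
	prev->next = next;
	b->next = b->prev = NULL;
	q->qlen--;
}
```

With three buffers a, b and c queued in that order, queue_unlink(&q, &b)
leaves a.next pointing at c and c.prev pointing at a. The same call on a
buffer whose pointers no longer refer to live queue members would store
through those pointers instead.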

Essentially on the stable 13 branch commits:

de40bc6f3d6bf8ab1a4b546630cc847a3b8c5113
    LinuxKPI: skbuff.h: fix -Warray-bounds warnings

and

b30d9f2e5204b475cf79150542b758c3dab14e70
    LinuxKPI: skbuff.h: add more (skeleton) functions used by wireless drivers

were reverted (along with everything else after June 25th, 2023), and it
seems to work. I'm no kernel expert, and there were quite a few commits for
Linux emulation and LinuxKPI between then and now, but at least the
wireless card is working again. BTW, I have to manually download these
sources from GitHub and copy them to a thumb drive, because "git clone"ing a
branch doesn't seem to work at all: it downloads for 30 seconds and then
just sits there doing nothing.

Thanks for your time,
Jeremy Cox