[Bug 272947] cxgbei: kernel panic in soreceive when hw.cxgbe.nofldtxq="-24"

From: <bugzilla-noreply_at_freebsd.org>
Date: Sat, 05 Aug 2023 17:41:17 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272947

Greg Becker <greg@codeconcepts.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |greg@codeconcepts.com

--- Comment #1 from Greg Becker <greg@codeconcepts.com> ---
I am seeing something similar, but with a very simple setup.  I have two
Supermicro-based machines, each with dual E5-2697A CPUs and a T62100-SO-CR
adapter.

t6nex0: <Chelsio T62100-SO-CR> mem 0xfb300000-0xfb37ffff,0xfa000000-0xfaffffff,0xfb984000-0xfb985fff irq 56 at device 0.4 numa-domain 1 on pci12
cc0: <port 0> numa-domain 1 on t6nex0
cc0: Ethernet address: 00:07:43:44:0a:d0
cc0: 16 txq, 8 rxq (NIC); 8 txq (TOE), 2 rxq (TOE)
cc1: <port 1> numa-domain 1 on t6nex0
cc1: Ethernet address: 00:07:43:44:0a:d8
cc1: 16 txq, 8 rxq (NIC); 8 txq (TOE), 2 rxq (TOE)
ccr0: <Chelsio Crypto Accelerator> numa-domain 1 on t6nex0
t6nex0: PCIe gen3 x16, 2 ports, 22 MSI-X interrupts, 70 eq, 21 iq

I set up an NFS server to serve over cc0 using default settings except for mtu
9000:

$ ifconfig cc0
cc0: flags=1008843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,LOWER_UP> metric 0 mtu 9000
        options=66ec07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6,HWSTATS,HWRXTSTMP,MEXTPG,VXLAN_HWCSUM,VXLAN_HWTSO>
        ether 00:07:43:44:0a:d0
        inet 172.16.100.200 netmask 0xffffff00 broadcast 172.16.100.255
        media: Ethernet autoselect (100GBase-CR4 <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>


This works fine.  However, if I unmount on the client and run 'ifconfig cc0
toe', then during the next mount my machine crashes with the following:


Fatal trap 12: page fault while in kernel mode
cpuid = 19; apic id = 26
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xfffffe03f0738ce8
frame pointer           = 0x28:0xfffffe03f0738d40
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 4777 (mount_nfs)
rdi: fffff811c4d9bb40 rsi: 0000000000000000 rdx: fffffe03f0738da0
rcx: 0000000000000000  r8: 0000000000000000  r9: 0000000000000000
rax: ffffffff82af1148 rbx: 000000000000003c rbp: fffffe03f0738d40
r10: 0000000000000000 r11: fffffe0269525540 r12: fffff811c4d9bb40
r13: 0000000000000000 r14: fffffe03f0738da0 r15: fffffe0269525020
trap number             = 12
panic: page fault
cpuid = 19
time = 1691256350
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe03f07389b0
vpanic() at vpanic+0x130/frame 0xfffffe03f0738ae0
panic() at panic+0x43/frame 0xfffffe03f0738b40
trap_fatal() at trap_fatal+0x40c/frame 0xfffffe03f0738ba0
trap_pfault() at trap_pfault+0xae/frame 0xfffffe03f0738c10
calltrap() at calltrap+0x8/frame 0xfffffe03f0738c10
--- trap 0xc, rip = 0, rsp = 0xfffffe03f0738ce8, rbp = 0xfffffe03f0738d40 ---
??() at 0/frame 0xfffffe03f0738d40
dofilewrite() at dofilewrite+0x82/frame 0xfffffe03f0738d90
sys_write() at sys_write+0xc2/frame 0xfffffe03f0738e00
amd64_syscall() at amd64_syscall+0x138/frame 0xfffffe03f0738f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe03f0738f30
--- syscall (4, FreeBSD ELF64, write), rip = 0x224e7fd0b1ca, rsp = 0x224e7e1c2c88, rbp = 0x224e7e1c2d80 ---
KDB: enter: panic
[ thread pid 4777 tid 101361 ]
Stopped at      kdb_enter+0x32: movq    $0,0xfee2a3(%rip)


This is on a 14.0-CURRENT build that I rebased this morning:

FreeBSD sm1.cc.codeconcepts.com 14.0-CURRENT FreeBSD 14.0-CURRENT amd64 1400094 #23 main-n264571-6f15b7e19952: Sat Aug  5 09:12:25 CDT 2023 greg@sm1.cc.codeconcepts.com:/usr/obj/usr/src/amd64.amd64/sys/SM1 amd64

git log --oneline
6f15b7e19952 (HEAD -> main, origin/main, origin/HEAD) ldconfig script: enable 32-bit compat on aarch64
