[Bug 275470] Kernel Panic in IPFW when adding entries to table

From: <bugzilla-noreply_at_freebsd.org>
Date: Fri, 01 Dec 2023 12:46:35 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275470

            Bug ID: 275470
           Summary: Kernel Panic in IPFW when adding entries to table
           Product: Base System
           Version: 14.0-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: thierry.dussuet@protonmail.com

Hi everyone,

When entries are added to an ipfw table through a cron job, the kernel panics
after roughly 8-9 days of uptime:

Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 0b
fault virtual address   = 0x2c
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff81f5daf2
stack pointer           = 0x28:0xfffffe016860c800
frame pointer           = 0x28:0xfffffe016860c900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 78144 (ipfw)
rdi: fffff800050f8000 rsi: 0000000000000000 rdx: 0000000000001000
rcx: 0000000000000040  r8: fffffe01c1a59000  r9: 0000000000000b40
rax: 000000000000f4fd rbx: fffff800050f8000 rbp: fffffe016860c900
r10: 4000000000000000 r11: fffffe0167fcd540 r12: fffff802f021d700
r13: 0000000000000002 r14: fffffe016860c958 r15: fffffe016860c888
trap number             = 12
panic: page fault
cpuid = 11
time = 1701385226
KDB: stack backtrace:
#0 0xffffffff80b9002d at kdb_backtrace+0x5d
#1 0xffffffff80b43132 at vpanic+0x132
#2 0xffffffff80b42ff3 at panic+0x43
#3 0xffffffff8100c85c at trap_fatal+0x40c
#4 0xffffffff8100c8af at trap_pfault+0x4f
#5 0xffffffff80fe3828 at calltrap+0x8
#6 0xffffffff81f530bb at add_table_entry+0x54b
#7 0xffffffff81f572e0 at manage_table_ent_v1+0x1c0
#8 0xffffffff81f4d069 at ipfw_ctl3+0x689
#9 0xffffffff80beadc3 at sogetopt+0xd3
#10 0xffffffff80bef79f at kern_getsockopt+0xaf
#11 0xffffffff80bef6c2 at sys_getsockopt+0x52
#12 0xffffffff8100d119 at amd64_syscall+0x109
#13 0xffffffff80fe413b at fast_syscall_common+0xf8
Uptime: 8d23h59m54s

# uname -v
FreeBSD 14.0-RELEASE #0 releng/14.0-n265380-f9716eee8ab4: Fri Nov 10 05:57:23 UTC 2023 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC

The cron job does:
rsync -aqz rsync-mirrors.uceprotect.net::RBLDNSD-ALL /tmp/dnsbl/

awk '/^[0-9]/ && !/127.0.0/ {print $1}' /tmp/dnsbl/dnsbl-1.uceprotect.net |
xargs -n10 -P1 ipfw -q table 53 add

# wc -l /tmp/dnsbl/dnsbl-1.uceprotect.net
   99568 /tmp/dnsbl/dnsbl-1.uceprotect.net

The ipfw tables in use:
00001 deny ip from table(1) to me
00002 deny ip from table(22) to me
00003 deny ip from table(42) to me
00004 deny ip from table(53) to me
(and then other rules including nat)

# ipfw table 53 detail
--- table(53), set(0) ---
 kindex: 4, type: addr
 references: 1, valtype: legacy
 algorithm: addr:radix
 items: 49760, size: 5971496
 IPv4 algorithm radix info
  items: 49760 itemsize: 120
 IPv6 algorithm radix info
  items: 0 itemsize: 128

The -n10 and -P1 arguments to xargs were an attempt to reduce parallel calls
to ipfw. This seems to have delayed the panics by a few days, but I cannot say
so for certain.
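
For comparison, here is a minimal sketch of how the same awk filter could feed
a single ipfw invocation instead of one invocation per ten addresses. It is an
illustration only and has not been tested against this panic. The helper name
make_table_cmds and the output path are hypothetical. It relies on ipfw
accepting an absolute pathname to a file of commands, and how duplicate
entries are handled in that mode has not been verified here.

```sh
#!/bin/sh
# Hypothetical helper: turn a blocklist file into an ipfw command file.
# $1 = blocklist file, $2 = table number
# Uses the same awk filter as the cron job: keep lines that start with a
# digit and do not contain "127.0.0", and take the first field as the address.
make_table_cmds() {
    awk -v t="$2" '/^[0-9]/ && !/127.0.0/ {print "table " t " add " $1}' "$1"
}

# Usage sketch (the output path is an assumption, ipfw needs an absolute path):
#   make_table_cmds /tmp/dnsbl/dnsbl-1.uceprotect.net 53 > /tmp/dnsbl/table53.cmds
#   ipfw -q /tmp/dnsbl/table53.cmds
```

Each generated line adds exactly one address, so the whole list is loaded by
one ipfw process rather than by thousands of short-lived ones.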

Is there any further information I can provide, or any action I can take, that
would help track down what is happening? I am also willing to switch to
-CURRENT and try any patches if that might help.

(I found bug #272073, which describes what seems to be a similar kernel panic
reached through a different path inside ipfw. Its workaround is to set sysctl
kern.ipc.mb_use_ext_pgs=0.)
