ifa_free panic in 8-STABLE
Erik Klavon
erikk at berkeley.edu
Mon Apr 26 23:52:21 UTC 2010
Hi
I have an amd64 machine with two single-core processors running a
recent cvsup of 8-STABLE. On this development machine I use
netgraph(4) to implement one-to-one NAT with a single ng_nat(4)
node. ipfw(8) rules direct traffic to the netgraph nodes as needed,
based on table entries, via an ng_ipfw(4) node. When I load test
ng_nat on the development system with iperf(1) running on a separate,
independent system, the development system panics after a couple of
days as follows.
panic: negative refcount 0xffffff0002a344d4
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x182
ifa_free() at ifa_free+0x5d
ip_output() at ip_output+0x49d
ip_forward() at ip_forward+0x199
ip_input() at ip_input+0x4cd
ng_ipfw_rcvdata() at ng_ipfw_rcvdata+0xad
VOP_STRATEGY: bp is not locked but should be
KDB: enter: lock violation
[ thread pid 17 tid 100042 ]
Stopped at kdb_enter+0x3d: movq $0,0x6bb730(%rip)
This panic is repeatable on this machine. I am unable to obtain a core
dump after these panics: when I attempt to dump core with the panic
command in DDB, the system stops responding and must be power
cycled. I first encountered this problem with 8.0-RELEASE-p1, and I've
reproduced it with both em(4) and bge(4) interfaces.
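In case the dump configuration matters here, mine is the standard
arrangement; a minimal sketch, assuming swap on ad0s1b (not my actual
device name):

# in /etc/rc.conf, so rc(8) runs dumpon at boot:
dumpdev="AUTO"

# or by hand on a running system:
dumpon /dev/ad0s1b

# after a reboot, savecore(8) pulls the dump off the swap device:
savecore /var/crash /dev/ad0s1b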
I've looked around in the functions mentioned in the backtrace, but
haven't made any progress in identifying why this panic
occurs. Please share any suggestions for tracking down the source of
this problem.
My kernel is configured with makeoptions DEBUG=-g plus the options
KDB, DDB, KDB_TRACE, BREAK_TO_DEBUGGER, INVARIANTS,
INVARIANT_SUPPORT, WITNESS, DEBUG_LOCKS, DEBUG_VFS_LOCKS, DIAGNOSTIC,
SW_WATCHDOG, DEADLKRES, IPFIREWALL, IPFIREWALL_VERBOSE,
IPFIREWALL_VERBOSE_LIMIT=100, IPFIREWALL_FORWARD, and IPDIVERT.
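Spelled out in kernel config file syntax, that is:

makeoptions DEBUG=-g
options KDB
options DDB
options KDB_TRACE
options BREAK_TO_DEBUGGER
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_LOCKS
options DEBUG_VFS_LOCKS
options DIAGNOSTIC
options SW_WATCHDOG
options DEADLKRES
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_VERBOSE_LIMIT=100
options IPFIREWALL_FORWARD
options IPDIVERT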
I use the following ipfw(8) rules to direct traffic from the
independent system to netgraph and vice versa. (x and y below replace
the first two octets of the globally routable addresses I'm using in
this test.)
# direct traffic from the independent system into ng_nat
01100 netgraph tablearg ip from table(87) to any in
# direct traffic from the internet into ng_nat
01110 netgraph tablearg ip from any to table(88) in via vlan615
# forward NATed traffic to the subnet's router if it isn't local
01120 fwd x.y.254.1 ip4 from x.y.254.0/25 to not x.y.254.0/25 in via vlan613
# pass traffic after it is NATed, so the default deny rule doesn't block it
01130 allow ip from any to table(87)
01140 allow ip from table(88) to any
ipfw(8) table 87 contains the entry 10.10.0.10/32 with value
200254017, and table 88 contains the entry x.y.254.17/32 with value
100254017.
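For reference, entries of this form are added with ipfw's table
command; the values match the hook names on the ipfw: node in the
listing below:

ipfw table 87 add 10.10.0.10/32 200254017
ipfw table 88 add x.y.254.17/32 100254017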
The above two table entries direct traffic to the following ng_nat(4)
node (the listing is ngctl(8) show output).
Name: NAT0254017 Type: nat ID: 0000000b Num hooks: 2
Local hook Peer name Peer type Peer ID Peer hook
---------- --------- --------- ------- ---------
in ipfw ipfw 00000001 100254017
out ipfw ipfw 00000001 200254017
This ng_nat(4) node was created using the following commands.
ngctl mkpeer ipfw: nat 100254017 out
ngctl name ipfw:100254017 NAT0254017
ngctl connect ipfw: NAT0254017: 200254017 in
ngctl msg NAT0254017: setaliasaddr x.y.254.17
ngctl msg NAT0254017: redirectaddr { "local_addr=10.10.0.10" "alias_addr=x.y.254.17" 'description="Static NAT"' }
I've assigned the independent system the address 10.10.0.10 on
vlan613, with a default router of 10.10.0.1; the development system's
address on vlan613 is 10.10.0.1. With the above setup, traffic from
the independent system is NATed by the development system to the IP
address x.y.254.17. I use iperf -d -U -P 20 for the load testing, with
another system outside of the test setup acting as an iperf server.
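(For completeness: the server side is plain iperf in listen mode,
something like iperf -s -u, given that -U on the client side makes
this a UDP test.)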
Erik