Netgraph/mpd5 stability issues

Eugene Grosbein egrosbein at
Wed Feb 16 09:04:21 UTC 2011

On 16.02.2011 14:46, Gleb Smirnoff wrote:
> On Wed, Feb 16, 2011 at 10:13:59AM +0600, Eugene Grosbein wrote:
> E> I run AMD64 with 4GB of memory, lots of memory is free and
> E> I still get panics often, sometimes two in a couple of hours.
> E> It does not seem memory exhaustion to me. It seems as very low probable race
> E> that happens occasionally but may happen any time.
> E> 
> E> With Gleb's patch, it is obvious that panic happens at moments of user disconnect.
> I missed: did my patch fix panics in the ng_address_hook(), in this block?
>         if ((hook == NULL) ||   
>             NG_HOOK_NOT_VALID(hook) ||
>             NG_HOOK_NOT_VALID(peer = NG_HOOK_PEER(hook)) ||
>             NG_NODE_NOT_VALID(peernode = NG_PEER_NODE(hook))) {
>                 NG_FREE_ITEM(item);
>                 TRAP_ERROR();
>                 return (ENETDOWN);
>         }

It seems so, yes. All my panics are now in _chkhook(), which gets called
with a bad hook as its first argument.
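
To make the failure mode concrete, here is a minimal userland sketch of the
kind of check quoted above. It is not the netgraph code: struct hook,
struct node, the FLAG_INVALID bit and address_hook() are hypothetical names
made up for illustration. It only shows that flag checks of this sort catch
a hook that has been marked invalid while its memory is still intact; they
cannot help if the pointer itself refers to freed or overwritten memory.

```c
#include <errno.h>
#include <stddef.h>

#define FLAG_INVALID 0x1	/* hypothetical "being torn down" bit */

struct node {
	int flags;
};

struct hook {
	int flags;
	struct hook *peer;	/* other end of the link */
	struct node *node;	/* node that owns this hook */
};

/*
 * Resolve the node behind the peer of `hook`.
 * Returns 0 and stores the node in *dst, or ENETDOWN if any element
 * of the chain is missing or flagged invalid.
 */
static int
address_hook(struct hook *hook, struct node **dst)
{
	struct hook *peer;
	struct node *peernode;

	if (hook == NULL || (hook->flags & FLAG_INVALID))
		return (ENETDOWN);
	peer = hook->peer;
	if (peer == NULL || (peer->flags & FLAG_INVALID))
		return (ENETDOWN);
	peernode = peer->node;
	if (peernode == NULL || (peernode->flags & FLAG_INVALID))
		return (ENETDOWN);
	*dst = peernode;
	return (0);
}
```

Every test dereferences the pointer it is given, so a stale pointer can
fault inside the check itself, before any flag is examined.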

> All the panics reported by you and Mike recently have traces unrelated
> to netgraph, and also traces look weird.

No, almost all my panics are related to netgraph. The call chains look like this:

ip_fastforward() - ng_rmnode_self() - ng_address_hook() - trap
sendto() - kern_sendit() - sosend_generic() - ng_parse_get_token() - ... - trap

Only one of my panics was unrelated to netgraph, with igmp_change_state() in the trace.

> May be there is some kind of memory corruption? May be try memguard(9)?

I can try memguard too; please tell me again which settings I should use.

One more thing: I've noticed that my traces show plenty of recursive calls,
for example (from my letter of 07.02):

panic: page fault
cpuid = 1
KDB: stack backtrace:
X_db_sym_numargs() at 0xffffffff801a227a = X_db_sym_numargs+0x15a
kdb_backtrace() at 0xffffffff8033d547 = kdb_backtrace+0x37
panic() at 0xffffffff8030b567 = panic+0x187
dblfault_handler() at 0xffffffff804c0ca0 = dblfault_handler+0x330
dblfault_handler() at 0xffffffff804c107f = dblfault_handler+0x70f
trap() at 0xffffffff804c155f = trap+0x3df
calltrap() at 0xffffffff804a8de4 = calltrap+0x8
--- trap 0xc, rip = 0xffffffff803e4f36, rsp = 0xffffff80ebff7400, rbp = 0xffffff80ebff7430 ---
ng_parse_get_token() at 0xffffffff803e4f36 = ng_parse_get_token+0x6596
ng_parse_get_token() at 0xffffffff803e5ccf = ng_parse_get_token+0x732f
ng_destroy_hook() at 0xffffffff803d53b2 = ng_destroy_hook+0x222
ng_rmnode() at 0xffffffff803d6118 = ng_rmnode+0xa08
ng_snd_item() at 0xffffffff803d8520 = ng_snd_item+0x3f0
ng_destroy_hook() at 0xffffffff803d52ed = ng_destroy_hook+0x15d
ng_rmnode() at 0xffffffff803d57b9 = ng_rmnode+0xa9
ng_rmnode() at 0xffffffff803d7664 = ng_rmnode+0x1f54
ng_snd_item() at 0xffffffff803d8520 = ng_snd_item+0x3f0
ng_parse_get_token() at 0xffffffff803e97fa = ng_parse_get_token+0xae5a
sosend_generic() at 0xffffffff80373df6 = sosend_generic+0x436
kern_sendit() at 0xffffffff803776d5 = kern_sendit+0x1a5
kern_sendit() at 0xffffffff8037790c = kern_sendit+0x3dc
sendto() at 0xffffffff803779fd = sendto+0x4d
syscallenter() at 0xffffffff8034a015 = syscallenter+0x1e5
syscall() at 0xffffffff804c10fb = syscall+0x4b
Xfast_syscall() at 0xffffffff804a90c2 = Xfast_syscall+0xe2
--- syscall (133, FreeBSD ELF64, sendto), rip = 0x8018c971c, rsp = 0x7fffffbfeab8, rbp = 0x80203dcc0 ---
Uptime: 2d17h1m42s

Is this normal? Is NETGRAPH protected against such an execution flow?
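
For illustration, one common way to keep such a chain from nesting without
bound is to stop calling directly past some depth and queue the work
instead. The sketch below is a userland model of that idea only. The names
dispatch(), drain() and teardown(), the MAX_DEPTH limit and the fixed-size
queue are all hypothetical, and nothing here is a claim about what
ng_snd_item() actually does.

```c
#include <stddef.h>

#define MAX_DEPTH 4	/* hypothetical limit on nested direct calls */
#define QUEUE_LEN 64	/* hypothetical queue capacity */

struct work {
	void (*fn)(int);
	int arg;
};

static struct work queue[QUEUE_LEN];
static size_t q_head, q_tail;
static int depth, max_seen, steps;

/* Call fn(arg) directly while shallow; defer it once nesting is too deep. */
static void
dispatch(void (*fn)(int), int arg)
{
	if (depth >= MAX_DEPTH && q_tail < QUEUE_LEN) {
		queue[q_tail].fn = fn;
		queue[q_tail].arg = arg;
		q_tail++;
		return;
	}
	/* Shallow enough (or queue full in this toy model): call directly. */
	depth++;
	if (depth > max_seen)
		max_seen = depth;
	fn(arg);
	depth--;
}

/* Run deferred work from the top level, where the stack is shallow again. */
static void
drain(void)
{
	while (q_head < q_tail) {
		struct work w = queue[q_head++];
		dispatch(w.fn, w.arg);
	}
	q_head = q_tail = 0;
}

/* Each teardown step triggers the next one, like a chain of linked nodes. */
static void
teardown(int remaining)
{
	steps++;
	if (remaining > 0)
		dispatch(teardown, remaining - 1);
}
```

Calling dispatch(teardown, 20) followed by drain() runs all 21 steps, yet
no more than MAX_DEPTH calls are ever nested at once.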

Eugene Grosbein
