[Fwd: Re: rtfree: 0xc5caad98 has 2 refs]
Per olof Ljungmark
peo at intersonic.se
Fri Dec 28 14:09:29 PST 2007
Stefan Lambrev wrote:
> Hi,
>
> Can you replace all calls to rtfree() with RTFREE_LOCKED() in those files:
>
> netinet/if_ether.c
> netinet6/nd6_nbr.c
> netinet6/in6_ifattach.c
> netinet6/in6_gif.c
>
> Of course do not forget net/route.c with the patch from the PR.
> Recompile the kernel and check if this will cure your hangs?
>
> I'm not sure about the lock order reversal; maybe it was introduced
> with kdb_backtrace().
> You can remove it from route.c, replace rtfree() and build kernel with
> debug, to see if the LOR is gone.
>
> It seems that the panic is caused by rtalloc1() called in route.c line
> 333 :
> rt = rtalloc1(dst, 0, 0UL); /* NB: rt is locked */
>
> most probably because rt is not locked :)
> I'm out of ideas how to check if it is really locked, but you can
> experiment with RT_LOCK() and RT_UNLOCK().
> Maybe mtx_trylock() can help too.
>
> Please share your findings with -net & -current if you did not before.
>
> =cut=
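For anyone following along, here is a small userland sketch of why swapping rtfree() for RTFREE_LOCKED() should at least silence the "has N refs" message. It is a toy model, not kernel code: struct toy_rtentry and the toy_* functions are made-up names, the lock is just a flag, and the behaviour is only my reading of rtfree() and the RTFREE_LOCKED() macro in net/route.h, so please check it against your own source tree.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userland stand-in for struct rtentry; NOT the kernel type. */
struct toy_rtentry {
	long refcnt;	/* models rt_refcnt */
	int  locked;	/* models holding rt_mtx (1 = held by the caller) */
	int  freed;	/* set once the entry would have been released */
	int  warned;	/* number of "has N refs" diagnostics printed */
};

/* Models RT_LOCK_ASSERT(): the caller must hold the entry lock. */
static void
toy_lock_assert(const struct toy_rtentry *rt)
{
	assert(rt->locked);
}

/*
 * Models rtfree() as assumed here: drop one reference; if references
 * remain, print the diagnostic and keep the entry, otherwise mark it
 * freed. The lock is released either way.
 */
static void
toy_rtfree(struct toy_rtentry *rt)
{
	toy_lock_assert(rt);
	rt->refcnt--;
	if (rt->refcnt > 0) {
		printf("rtfree: %p has %ld refs\n", (void *)rt, rt->refcnt);
		rt->warned++;
	} else {
		rt->freed = 1;
	}
	rt->locked = 0;
}

/*
 * Models RTFREE_LOCKED() as assumed here: only call rtfree() for the
 * last reference; otherwise just drop a reference and unlock, so the
 * diagnostic path is never reached.
 */
static void
toy_rtfree_locked(struct toy_rtentry *rt)
{
	toy_lock_assert(rt);
	if (rt->refcnt <= 1) {
		toy_rtfree(rt);
	} else {
		rt->refcnt--;
		rt->locked = 0;
	}
}
```

In this model, calling toy_rtfree() on an entry with three references prints the familiar message, while toy_rtfree_locked() drops the reference quietly. Both helpers assert that the lock is held on entry; in a real kernel built with INVARIANTS, RT_LOCK_ASSERT(rt) or mtx_assert(9) with MA_OWNED should be the way to check the same thing.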
Unfortunately I ran out of time before I could complete the test.
However, I can report one more interesting finding from today: the ICMP
packets that trigger the bug probably come either from a Cisco router
or from the setup itself.
Late today our network topology was changed.
Previous setup:
affected hosts ISP's router (default gw)
.1
LAN ------------ router-------- wlan 1 (via ISP)
| 192.168.3.0
our firewall .254 |
fw ----------wlan 2
| 172.16.2.0 (isakmpd)
|
Internet
Current setup:
affected hosts our fw (OpenBSD)
.1 192.168.3.0
LAN ------------ router------ wlan 1 (isakmpd)
|
| 172.16.2.0 (isakmpd)
| --------wlan 2
|
|
Internet
and this "fixed" the problem!
We have no access to the Cisco, so I don't know its configuration. But:
No lockups, no "rtfree" messages.
If the bug is still unresolved in mid-January, I can continue testing
then. Thanks to all for your suggestions and help!
--per
More information about the freebsd-net
mailing list