mpd5/Netgraph issues after upgrading to 7.4

Ryan Stone rysto32 at gmail.com
Mon Jul 9 20:25:34 UTC 2012


On Mon, Jul 9, 2012 at 4:12 AM, Gleb Smirnoff <glebius at freebsd.org> wrote:
> This looks very much related to a known race in ARP code.
>
> See this email and related thread:
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031865.html
>
> Ryan didn't check in any patches since, and I failed to follow on this
> problem due to ENOTIME.
>
> I've added Ryan to Cc. Ryan, what's the status of the problem at your
> side? Did you come to any solution?

Unfortunately I was never able to come to a satisfactory solution.  As
I recall, in the end I ran headlong into problems with making the
locking sane.  The big problem was with arpresolve.  At one point it
calls callout_reset to schedule the LLE's la_timer.  In my patch this
would have to be done with a write lock held on the afdata lock.
However, this acquisition would have to be done before taking the
LLE_LOCK to prevent a LOR, and in the end you conclude that you have
to take a write lock on the ifnet's afdata lock for every packet that
goes through arpresolve, which was a non-starter.  That's the point
that I reached before I got distracted by other things at $WORK.

As I recall, the in6 case was even worse, as the in6 equivalent of
arptimer is significantly more complicated and likes to do crazy
things like dropping locks.

