Race between arptimer() and lle removal [WAS: panic in arptimer in r289937]

Alexander V. Chernikov melifaro at ipfw.ru
Fri Dec 11 10:13:01 UTC 2015


11.12.2015, 12:15, "Hans Petter Selasky" <hps at selasky.org>:
> Hi,
>
> Pulling the nail out of the haystack hopefully.
>
>>>  Any ideas on where next to look?
>
> Adrian: In your dump as well I see:
>
> la_flags = 1
>
> That means there was a race calling arptimer() and removing the "lle".
Yes. The interesting part here is why the lle was removed. There are several possible reasons: the interface address was deleted, the interface went down, or there was an explicit delete request.
That's why I asked Adrian about the interface state (and haven't got a reply yet).
>
> Alexander: Can you comment on the following patch:
>
>  > Index: netinet/if_ether.c
>  > ===================================================================
>  > --- netinet/if_ether.c (revision 291256)
>  > +++ netinet/if_ether.c (working copy)
>  > @@ -185,7 +185,13 @@
>  > LLE_WUNLOCK(lle);
>  > return;
>  > }
>  > - ifp = lle->lle_tbl->llt_ifp;
>  > + if (lle->la_flags & LLE_LINKED) {
>  > + ifp = lle->lle_tbl->llt_ifp;
>  > + } else {
>  > + /* XXX RACE entry has been freed */
>  > + llentry_free(lle);
>  > + return;
>  > + }
>  > CURVNET_SET(ifp->if_vnet);
>  >
>  > if ((lle->la_flags & LLE_DELETED) == 0) {
>
> We need a check in arptimer() that the lle is still linked before
> proceeding, from what I can see. Because the callback is not
> protected by a mutex, it is not atomically stopped by callout_stop().
Yes, I had exactly that approach in mind (and nd6_llinfo_timer() needs the same fix).
So, would you commit it or should I?
>
> --HPS
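
To make the idea concrete outside the kernel, here is a rough userland sketch of the pattern discussed above: the timer handler takes the entry lock, checks a "linked" flag, and only then follows the back-pointer to the table. Everything in it is a hypothetical stand-in (struct entry, struct table, ENTRY_LINKED, entry_timer() and so on); it uses a pthread mutex in place of the lle lock and a plain counter in place of the real reference counting, so treat it as an illustration of the check rather than the actual if_ether.c change.

```c
/* Userland illustration only; none of these names are kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ENTRY_LINKED 0x40	/* hypothetical flag, value chosen arbitrarily */

struct table {
	const char *ifname;	/* stands in for the table's interface pointer */
};

struct entry {
	pthread_mutex_t lock;	/* stands in for the per-entry lock */
	int flags;
	int refcnt;		/* simplified reference count */
	struct table *tbl;	/* back-pointer, only valid while linked */
};

static void
entry_init(struct entry *e, struct table *t)
{
	pthread_mutex_init(&e->lock, NULL);
	e->flags = ENTRY_LINKED;
	e->refcnt = 2;		/* one for the table, one for the armed timer */
	e->tbl = t;
}

/* Removal path: unlink the entry and drop the table's reference. */
static void
entry_unlink(struct entry *e)
{
	pthread_mutex_lock(&e->lock);
	e->flags &= ~ENTRY_LINKED;
	e->tbl = NULL;		/* back-pointer must not be used after this */
	e->refcnt--;
	pthread_mutex_unlock(&e->lock);
}

/*
 * Timer handler: returns the interface name if the entry is still
 * linked, or NULL if removal won the race. In the NULL case it only
 * drops the timer's reference and never dereferences e->tbl.
 */
static const char *
entry_timer(struct entry *e)
{
	const char *name = NULL;

	assert(e != NULL);
	pthread_mutex_lock(&e->lock);
	if (e->flags & ENTRY_LINKED)
		name = e->tbl->ifname;
	else
		e->refcnt--;
	pthread_mutex_unlock(&e->lock);
	return (name);
}
```

Calling entry_timer() on a freshly initialised entry returns the table's interface name; calling it after entry_unlink() returns NULL and leaves the simplified reference count at zero, which is the point where a real implementation would free the entry.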

