[vnet] [epair] epair interface stops working after some time

Kristof Provost kristof at sigsegv.be
Thu Mar 29 13:09:17 UTC 2018


On 29 Mar 2018, at 14:48, Reshad Patuck wrote:
> pulling the 'net.link.epair.netisr_maxqlen' down does seem to make 
> this occur faster.

Good, I think my hypothesis about where the issue lies is correct then.
You should be able to avoid (or at least reduce the frequency of) the 
issue by increasing the value on your system(s).

> When I manually dropped it to 2 like Kristof did, a box which was not 
> exhibiting the problem began to show the same symptoms.
> Bumping it back up to 2100 did not restore the functionality (I don't 
> know if it should).

It’s good to know this. It doesn’t surprise me that it doesn’t fix 
things.
Something’s wrong in the code that handles an overflow of the netisr 
queue in the epair driver. Once that happens the IFF_DRV_OACTIVE flag 
gets set, and we keep enqueuing outside the netisr queue.
Somehow we never end up back in epair_nh_drainedcpu(), so the flag never 
gets cleared and the driver never recovers.
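
To make the suspected failure mode concrete, here is a toy user-space model 
of it. This is only a sketch of the hypothesis above, not the actual 
if_epair.c code: the struct, the toy_* functions, and the MAXQLEN constant 
are all made up for illustration. The "oactive" field stands in for 
IFF_DRV_OACTIVE, and toy_drained() stands in for the drained callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for net.link.epair.netisr_maxqlen. */
#define MAXQLEN 2

/* Toy model of one epair side; not the real driver softc. */
struct toy_epair {
    size_t netisr_len;    /* packets in the bounded netisr queue */
    size_t overflow_len;  /* packets parked outside the netisr queue */
    bool   oactive;       /* stand-in for IFF_DRV_OACTIVE */
};

/*
 * Transmit path: once the bounded queue overflows, set the flag and
 * park every further packet on the overflow queue.
 */
static void
toy_transmit(struct toy_epair *sc)
{
    assert(sc != NULL);
    if (sc->oactive || sc->netisr_len >= MAXQLEN) {
        sc->oactive = true;
        sc->overflow_len++;
        return;
    }
    sc->netisr_len++;
}

/*
 * Stand-in for the drained callback: move parked packets back into the
 * bounded queue and clear the flag once nothing is parked any more.
 */
static void
toy_drained(struct toy_epair *sc)
{
    assert(sc != NULL);
    while (sc->overflow_len > 0 && sc->netisr_len < MAXQLEN) {
        sc->overflow_len--;
        sc->netisr_len++;
    }
    if (sc->overflow_len == 0)
        sc->oactive = false;
}

/*
 * The consumer empties the bounded queue. If call_drained is false,
 * the drained callback is skipped, which models the suspected bug.
 */
static void
toy_netisr_run(struct toy_epair *sc, bool call_drained)
{
    assert(sc != NULL);
    sc->netisr_len = 0;
    if (call_drained)
        toy_drained(sc);
}
```

If toy_netisr_run() is called with call_drained set to false after an 
overflow, the flag stays set even though the bounded queue is empty, and 
every later toy_transmit() lands on the overflow queue. In this model, 
raising MAXQLEN afterwards changes nothing, because the transmit path 
checks the flag first. That would match the observation that bumping the 
sysctl back up did not restore the interface.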

> I will create a PR for this later today with all the information I 
> have gathered so that we can have it all in one place.
>
Thanks. Please cc me on it. I’ll see if I can figure out what the 
problem is, but we might need someone smarter, so cc Bjoern too.

Regards,
Kristof