Freebsd IP Forwarding performance (question, and some info)
[7-stable, current, em, smp]
Bruce Evans
brde at optusnet.com.au
Mon Jul 7 19:15:46 UTC 2008
On Tue, 8 Jul 2008, Bruce Evans wrote:
> On Mon, 7 Jul 2008, Andre Oppermann wrote:
>
>> Bruce Evans wrote:
>>> So it seems that the major overheads are not near the driver (as I already
>>> knew), and upper layers are responsible for most of the cache misses.
>>> The packet header is accessed even in monitor mode, so I think most of
>>> the cache misses in upper layers are not related to the packet header.
>>> Maybe they are due mainly to perfect non-locality for mbufs.
>>
>> Monitor mode doesn't access the payload packet header. It only looks
>> at the mbuf (which has a structure called mbuf packet header). The mbuf
> >> header is hot in the cache because the driver just touched it and filled
>> in the information. The packet content (the payload) is cold and just
>> arrived via DMA in DRAM.
>
> Why does it use ntohs() then? :-). From if_ethersubr.c:
> ...
> % eh = mtod(m, struct ether_header *);
>
> Point outside of mbuf header.
>
> % etype = ntohs(eh->ether_type);
>
> First access outside of mbuf header.
> ...
> % 	/* Allow monitor mode to claim this frame, after stats are updated. */
> % 	if (ifp->if_flags & IFF_MONITOR) {
> % 		m_freem(m);
> % 		return;
> % 	}
>
> Finally return in monitor mode.
>
> I don't see any stats update before here except for the stray if_imcasts
> one.

There are some error stats with printfs, but I've never seen these do
anything except with a buggy sk driver.

Testing verifies that accessing eh above gives a cache miss.  Under
~5.2 receiving on bge0 at 397 kpps:

    -monitor: 17% idle  19 cm/p (18% less idle than under -current)
     monitor: 66% idle   8 cm/p (17% less idle than under -current)
    +monitor: 71% idle   7 cm/p (idle time under -current not measured)

+monitor is monitor mode with the exit moved to the top of ether_input().
If the cache miss takes the time measured by lmbench2 (42 ns), then
397 k of these per second costs 17 ms per second, or 1.7% CPU, which is
vaguely consistent with the improvement of 5% by not taking this cache miss.
Avoiding most of the 19 cache misses should give much more than a
5% improvement. Maybe -current gets its 17% improvement by avoiding
some.

More stats weirdness in userland:
- in monitor mode, em0 gives byte counts delayed while bge0 gives byte
  counts always 0.
- netstat -I <interface> 1 seems to be broken in ~5.2 in all modes -- it
  gives output for interfaces with drivers but no hardware.

All this is for UP.  An SMP kernel on the same UP system loses < 5% for at
least tx.

Bruce