Network stack changes
George Neville-Neil
gnn at neville-neil.com
Fri Sep 13 15:08:25 UTC 2013
On Aug 29, 2013, at 7:49, Adrian Chadd <adrian at freebsd.org> wrote:
> Hi,
>
> There's a lot of good stuff to review here, thanks!
>
> Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to keep
> locking things like that on a per-packet basis. We should be able to do
> this in a cleaner way - we can defer RX into a CPU pinned taskqueue and
> convert the interrupt handler to a fast handler that just schedules that
> taskqueue. We can ignore the ithread entirely here.
>
> What do you think?
>
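For what it's worth, here's a minimal sketch of that split using the stock
taskqueue(9) and bus_setup_intr(9) interfaces. The structure and function
names are invented for illustration, the ring-drain logic is elided, and
pinning the taskqueue thread to the queue's CPU would still have to be
arranged separately (e.g. via a cpuset on the thread):

    #include <sys/param.h>
    #include <sys/bus.h>
    #include <sys/priority.h>
    #include <sys/taskqueue.h>

    /* Hypothetical per-RX-queue state. */
    struct rxq {
        struct taskqueue *tq;
        struct task       rx_task;
        /* ... ring pointers, stats, etc. ... */
    };

    /* Fast filter: no locks, no mbuf work, just kick the taskqueue. */
    static int
    rxq_intr_filter(void *arg)
    {
        struct rxq *rxq = arg;

        taskqueue_enqueue(rxq->tq, &rxq->rx_task);
        return (FILTER_HANDLED);
    }

    /* All real RX processing happens in the taskqueue thread. */
    static void
    rxq_task(void *arg, int pending)
    {
        /* drain the descriptor ring, pass mbufs up the stack */
    }

    static int
    rxq_setup(device_t dev, struct rxq *rxq, struct resource *irq)
    {
        void *cookie;

        rxq->tq = taskqueue_create_fast("rxq", M_NOWAIT,
            taskqueue_thread_enqueue, &rxq->tq);
        TASK_INIT(&rxq->rx_task, 0, rxq_task, rxq);
        taskqueue_start_threads(&rxq->tq, 1, PI_NET, "%s rxq",
            device_get_nameunit(dev));

        /* Filter only; no ithread handler registered. */
        return (bus_setup_intr(dev, irq, INTR_TYPE_NET | INTR_MPSAFE,
            rxq_intr_filter, NULL, rxq, &cookie));
    }
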
> Totally pie in the sky handwaving at this point:
>
> * create an array of mbuf pointers for completed mbufs;
> * populate the mbuf array;
> * pass the array up to ether_demux().
>
> For vlan handling, it may end up populating its own list of mbufs to push
> up to ether_demux(). So maybe we should extend the API to have a bitmap of
> packets to actually handle from the array, so we can pass up a larger array
> of mbufs, note which ones are for the destination and then the upcall can
> mark which frames it's consumed.
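
To make that concrete, a toy version of the carrier structure might look
like this (all names hypothetical, nothing like this exists in ifnet today);
a 64-bit word per batch keeps the mark/consume bookkeeping cheap:

    #include <sys/param.h>
    #include <sys/mbuf.h>

    #define MB_BATCH_MAX 64             /* one uint64_t of mark bits */

    struct mbuf_batch {
        struct mbuf *mb_pkts[MB_BATCH_MAX];
        int          mb_count;
        uint64_t     mb_pending;        /* still to be looked at */
        uint64_t     mb_consumed;       /* taken by an upcall */
    };

    /* An upcall (vlan, ether_demux(), ...) marks what it swallowed;
     * the caller then only re-dispatches the still-pending frames. */
    static inline void
    mb_batch_consume(struct mbuf_batch *b, int i)
    {
        b->mb_consumed |= 1ULL << i;
        b->mb_pending &= ~(1ULL << i);
    }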
>
> I specifically wonder how much work/benefit we may see by doing:
>
> * batching packets into lists so various steps can batch process things
> rather than run to completion;
> * batching the processing of a list of frames under a single lock instance
> - eg, if the forwarding code could do the forwarding lookup for 'n' packets
> under a single lock, then pass that list of frames up to inet_pfil_hook()
> to do the work under one lock, etc, etc.
>
> Here, the processing would look less like "grab lock and process to
> completion" and more like "mark and sweep" - ie, we have a list of frames
> that we mark as needing processing and mark as having been processed at
> each layer, so we know where to next dispatch them.
>
One quick note here. Every time you increase batching you may increase bandwidth,
but you will also increase per-packet latency for the last packet in a batch.
That is fine, so long as we remember it and treat batch depth as a tuning knob
for balancing the two.
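
As a strawman of the single-lock idea (again, every name below is invented;
FWD_LOCK() stands in for whatever lock the forwarding path actually needs,
and mbuf_batch is the hypothetical carrier from above), the shape would be
one acquisition amortized over the whole batch:

    static void
    ip_forward_batch(struct mbuf_batch *b)
    {
        struct route ro[MB_BATCH_MAX];
        int i;

        FWD_LOCK();                 /* one lock for 'n' lookups */
        for (i = 0; i < b->mb_count; i++) {
            if ((b->mb_pending & (1ULL << i)) == 0)
                continue;
            fwd_lookup(b->mb_pkts[i], &ro[i]);
        }
        FWD_UNLOCK();

        /* sweep: hand the surviving frames to the next stage,
         * e.g. inet_pfil_hook() and then output */
        for (i = 0; i < b->mb_count; i++)
            if (b->mb_pending & (1ULL << i))
                fwd_output(b->mb_pkts[i], &ro[i]);
    }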
> I still have some tool coding to do with PMC before I even think about
> tinkering with this as I'd like to measure stuff like per-packet latency as
> well as top-level processing overhead (ie, CPU_CLK_UNHALTED.THREAD_P /
> lagg0 TX bytes/pkts, RX bytes/pkts, NIC interrupts on that core, etc.)
>
This would be very useful in identifying the actual hot spots, and would be helpful
to anyone who can generate a decent stream of packets with, say, an IXIA.
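In the meantime, hwpmc(4) in counting mode plus the stock interface counters
already give a first-order cycles-per-packet number; roughly (the event name
varies by CPU model):

    # kldload hwpmc
    # pmcstat -s CPU_CLK_UNHALTED.THREAD_P -w 1
    # netstat -I lagg0 -w 1
    # vmstat -i

Divide unhalted cycles per interval by packets per interval, and keep an eye
on the interrupt rate from vmstat -i while you do.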
Best,
George