IPFW update frequency
Julian Elischer
julian at elischer.org
Sat Mar 31 16:04:03 UTC 2007
Thanks for the information.
My main goal is to make it not hold any locks during packet processing.
Performance is secondary.
Andre Oppermann wrote:
> Julian Elischer wrote:
>> Luigi Rizzo wrote:
>>> On Fri, Mar 30, 2007 at 01:40:46PM -0700, Julian Elischer wrote:
>>>> I have been looking at the IPFW code recently, especially with
>>>> respect to locking.
>>>> There are some things that could be done to improve IPFW's behaviour
>>>> when processing packets, but some of these take a
>>>> toll (there is always a toll) on the 'updating' side of things.
>>>
>>> Certainly ipfw was not designed with SMP in mind. If you can tell us
>>> what your plan is to make the list lock free
>>> (which one, the static or the dynamic one?) maybe we can comment more.
>>>
>>> E.g. one option could be the usual trick of adding refcounts to
>>> the individual rules, and then using an array of pointers to them.
>>> While processing you grab a refcount to the array, and release it once
>>> done with the packet. If there is an addition or removal, you duplicate
>>> the array (which may be expensive for the large 20k rules mentioned),
>>> manipulate the copy and then atomically swap the pointers to the head.
>>
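A rough userland sketch of the copy-and-swap idea described above, using
the "easy" variant in which a short lock is taken on entry and on exit but
not while the rules are being walked. All names here (struct rule,
struct ruleset, ruleset_acquire and so on) are made up for illustration and
are not the ipfw kernel structures or the contents of atomic_replace.c. It
assumes a single updater at a time, and it leaves out the per-rule
refcounts, so the rules themselves are never freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an ipfw rule; not the kernel's rule struct. */
struct rule { int number; };

/* One snapshot of the rule list; never modified once published. */
struct ruleset {
    int refs;              /* protected by head_lock */
    size_t n;              /* number of rules */
    struct rule **rules;   /* array of pointers to rules */
};

static pthread_mutex_t head_lock = PTHREAD_MUTEX_INITIALIZER;
static struct ruleset *head;   /* current snapshot; holds one reference */

static struct ruleset *ruleset_alloc(size_t n)
{
    struct ruleset *rs = malloc(sizeof(*rs));
    if (rs == NULL)
        return NULL;
    rs->rules = calloc(n ? n : 1, sizeof(*rs->rules));
    if (rs->rules == NULL) {
        free(rs);
        return NULL;
    }
    rs->refs = 1;
    rs->n = n;
    return rs;
}

/* Install an empty rule set; call once before anything else. */
static int ruleset_init(void)
{
    head = ruleset_alloc(0);
    return head ? 0 : -1;
}

/* Packet path, entry: pin the current snapshot. */
static struct ruleset *ruleset_acquire(void)
{
    pthread_mutex_lock(&head_lock);
    struct ruleset *rs = head;
    rs->refs++;
    pthread_mutex_unlock(&head_lock);
    return rs;
}

/* Packet path, exit: unpin; the last reference frees the snapshot. */
static void ruleset_release(struct ruleset *rs)
{
    pthread_mutex_lock(&head_lock);
    assert(rs->refs > 0);
    int last = (--rs->refs == 0);
    pthread_mutex_unlock(&head_lock);
    if (last) {
        free(rs->rules);
        free(rs);
    }
}

/* Updater: duplicate the array, modify the copy, swap the head pointer.
 * Assumes only one updater runs at a time. */
static int ruleset_append(struct rule *r)
{
    struct ruleset *old = ruleset_acquire();
    struct ruleset *next = ruleset_alloc(old->n + 1);
    if (next == NULL) {
        ruleset_release(old);
        return -1;
    }
    if (old->n > 0)
        memcpy(next->rules, old->rules, old->n * sizeof(*old->rules));
    next->rules[old->n] = r;

    pthread_mutex_lock(&head_lock);
    head = next;
    old->refs--;           /* drop the reference the head pointer held */
    pthread_mutex_unlock(&head_lock);

    ruleset_release(old);  /* drop our own; frees old once readers are done */
    return 0;
}
```

A packet would call ruleset_acquire(), walk rs->rules with no lock held,
and then call ruleset_release(). A reader that entered before an update
keeps seeing the old array until it leaves.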
>> This is pretty close. I know I've mentioned this to people several
>> times over the last year or so. The trick is to do it in a way that
>> the average packet doesn't need to take any locks to get in, and the
>> updater does more work. If you are willing to acquire a lock on both
>> starting and ending the run through the firewall, it is easy.
>> (I already have code to do that; see
>> http://www.freebsd.org/~julian/atomic_replace.c, untested but
>> probably close.)
>> Doing it without requiring that each packet take those locks, however,
>> is a whole new level of problem.
>
> The locking overhead per packet in ipfw is by no means its limiting
> factor. Actually it's a very small part and pretty much any work on
> it is lost love. Time would be much better spent optimizing the
> main rule loop of ipfw to speed things up. I was profiling ipfw early
> last year with an Agilent packet generator and hwpmc. In the meantime
> the packet forwarding path (w/o ipfw) has been improved, but relative
> to each other the numbers are still correct.
>
> Numbers pre-taskqueue improvements from early 2006:
>  fastfwd                580357 pps
>  fastfwd+pfil_pass      565477 pps (no rules, just pass packet on)
>  fastfwd+ipfw_allow     505952 pps (one rule)
>  fastfwd+ipfw_30rules   401768 pps (30 IP address non-matching rules)
>  fastfwd+pf_pass        476190 pps (one rule)
>  fastfwd+pf_30rules     342262 pps (30 IP address non-matching rules)
>
> The overhead per packet is big. Enabling ipfw, with the per-packet
> pfil/ipfw hooks and their indirect function calls, causes a loss of
> only about 15'000 pps (0.9%). The first rule, on the other hand, costs
> 12.9% and each additional rule 0.6%. All this is without any complex
> rules like table lookups, state tracking, etc.
>
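For anyone wanting to check the percentages against the pps list, here is
a small sketch that re-derives them with the plain fastfwd figure as the
baseline. The function names, and the assumption that 29 rules separate
the one-rule run from the 30-rule run, are mine and not from the
measurements. On this basis the first rule comes out at about 12.8% and
each additional rule at about 0.6%, close to the figures quoted, while the
pfil hook alone comes out nearer 2.6% than 0.9%, so that last figure may
have been computed on a different basis.

```c
#include <assert.h>
#include <stdio.h>

/* pps figures copied from the early-2006 list quoted above */
#define PPS_FASTFWD      580357.0
#define PPS_PFIL_PASS    565477.0
#define PPS_IPFW_ALLOW   505952.0
#define PPS_IPFW_30RULES 401768.0

/* Throughput lost going from base to measured, as a percentage of base. */
static double loss_pct(double base, double measured)
{
    assert(base > 0.0);
    return 100.0 * (base - measured) / base;
}

/* Average cost of one additional rule, as a percentage of base.
 * extra_rules is an assumption about how many rules separate the runs. */
static double per_rule_pct(double base, double few, double many,
                           int extra_rules)
{
    assert(base > 0.0 && extra_rules > 0);
    return 100.0 * (few - many) / extra_rules / base;
}

/* Print the three derived figures, all relative to plain fastfwd. */
static void print_report(void)
{
    printf("pfil hook only:       %.2f%%\n",
           loss_pct(PPS_FASTFWD, PPS_PFIL_PASS));
    printf("pfil + first rule:    %.2f%%\n",
           loss_pct(PPS_FASTFWD, PPS_IPFW_ALLOW));
    printf("each additional rule: %.2f%%\n",
           per_rule_pct(PPS_FASTFWD, PPS_IPFW_ALLOW, PPS_IPFW_30RULES, 29));
}
```

Calling print_report() from main() prints the three percentages. Passing
30 instead of 29 as the rule count changes the per-rule figure only
slightly.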
>                             idle     fastfwd  fastfwd+ipfw_allow  fastfwd+ipfw_30rules
> cycles                2596685731  2598214743          2597973265            2596702381
> cpu-clk-unhalted         7824023  2582240847          2518187670            2483904362
> instructions             2317535  1324655330          1492363346            2026009148
> branches                  316786   174329367           191263118             294700024
> branch-mispredicts         19757     2235749            10003461               8848407
> dc-access                1417532   829159482           998427224            1235192770
> dc-refill-from-l2           2124     4767395             4346738               4548311
> dc-refill-from-system         89      803102              819658                654661
> dtlb-l2-hit                  626    10435843             9304448              12352018
> dtlb-miss                    129      255493              130998                112644
> ic-fetch                  804423   471138619           583149432             870371492
> ic-miss                     2358       34831             2505198               1947943
> itlb-l2-hit                    0          74                  12                    12
> itlb-miss                     42          92                  82                    82
> lock-cycles                   77         803                 352                   451
> locked-instructions            4          19                   2                     4
> lock-dc-access                 6          20                   6                     7
> lock-dc-miss                   0           0                   0                     0
>
> Hardware is a dual Opteron 852 at 2.6GHz on a Tyan 2882 mainboard with
> a dual Intel em network card plugged into a PCI64-133 slot. Packets
> are flowing from em0 -> em1.
>