dummynet dropping too many packets
rihad
rihad at mail.ru
Wed Oct 7 12:42:53 UTC 2009
Robert Watson wrote:
> Suggestions like increasing timer resolution are intended to spread out
> the injection of packets by dummynet to attempt to reduce the peaks of
> burstiness that occur when multiple queues inject packets in a burst
> that exceeds the queue depth supported by combined hardware descriptor
> rings and software transmit queue.
>
Raising HZ from 1000 to 2000 has helped. There are now 200-300 global
drops/s, as opposed to 300-1000 with HZ=1000. Or maybe changing
net.isr.direct from 1 to 0 helped. Or maybe raising hash_size from 64
to 256 did. Or maybe...
> The two solutions, then are (a) to increase the timer resolution
> significantly so that packets are injected in smaller bursts
But isn't there a risk that this actually makes things worse? From
/sys/conf/NOTES:
# The granularity of operation is controlled by the kernel option HZ whose
# default value (1000 on most architectures) means a granularity of 1ms
# (1s/HZ). Historically, the default was 100, but finer granularity is
# required for DUMMYNET and other systems on modern hardware. There are
# reasonable arguments that HZ should, in fact, be 100 still; consider,
# that reducing the granularity too much might cause excessive overhead in
# clock interrupt processing, potentially causing ticks to be missed and thus
# actually reducing the accuracy of operation.
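To put rough numbers on the granularity described in that excerpt: a tick lasts 1s/HZ, and a rate-limited pipe that always has packets waiting releases roughly rate/HZ bits per tick. The sketch below is illustrative arithmetic only. The 100 Mbit/s rate is an assumed example, not a value from this setup, and the function names tick_ms and bytes_per_tick are hypothetical.

```shell
#!/bin/sh
# Tick length in milliseconds for a given HZ (granularity = 1s/HZ).
tick_ms() {
    awk -v hz="$1" 'BEGIN { printf "%.3f\n", 1000 / hz }'
}

# Approximate bytes released per tick for a pipe limited to bw bit/s,
# assuming the pipe is always backlogged (an assumption for illustration).
bytes_per_tick() {
    awk -v bw="$1" -v hz="$2" 'BEGIN { printf "%d\n", bw / 8 / hz }'
}

# Compare the HZ values discussed in this thread for an assumed 100 Mbit/s pipe.
for hz in 1000 2000 4000; do
    echo "HZ=$hz tick=$(tick_ms "$hz")ms burst=$(bytes_per_tick 100000000 "$hz") bytes"
done
```

Doubling HZ halves the per-tick burst from each pipe, which is the effect the suggestion relies on. The cost side raised in NOTES (clock interrupt overhead) is not modelled here.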
> and (b) increase the queue capacities. The hardware queue limits likely can't
> be raised w/o new hardware, but the ifnet transmit queue sizes can be
> increased.
Can someone please say how to increase the "ifnet transmit queue sizes"?
> Timer resolution going up is almost certainly not a bad idea in your configuration, although does require a reboot as you have observed.
>
OK, I'll try HZ=4000, but this box also runs some required services
(flowtools, radius, mysql, a perl app).
> On a side note: one other possible interpretation of that statistic is
> that you're seeing fragmentation problems. Usually in forwarding
> scenarios this is unlikely. However, it wouldn't hurt to make sure you
> have LRO turned off on the network interfaces you're using, assuming
> it's supported by the driver.
>
I don't think fragments are the problem. The numbers are too small ;-)
$ netstat -s|fgrep fragment
5318 fragments received
147 fragments dropped (dup or out of space)
5157 fragments dropped after timeout
4088 output datagrams fragmented
8180 fragments created
0 datagrams that can't be fragmented
LRO isn't listed among the interface options, so I guess it's off:
options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
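For anyone wanting to script the same check, here is a minimal sketch that tests whether an ifconfig "options=" line lists LRO. The helper name has_lro is hypothetical, and the sample line is the one pasted above.

```shell
#!/bin/sh
# Return 0 if the given ifconfig "options=" line lists LRO, 1 otherwise.
has_lro() {
    # Take the comma-separated flag list between < and >, put one flag
    # per line, then look for an exact "LRO" entry.
    printf '%s\n' "$1" \
        | sed -n 's/.*<\(.*\)>.*/\1/p' \
        | tr ',' '\n' \
        | grep -qx 'LRO'
}

line='options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>'
if has_lro "$line"; then
    echo "LRO enabled"
else
    echo "LRO not listed"
fi
```

On a live system the input would come from the interface's ifconfig output. The options line shows only what is currently enabled. ifconfig -m also prints a capabilities line showing what the driver supports, and ifconfig <interface> -lro turns LRO off where it is supported.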