Gigabit ethernet questions?
Robert Watson
rwatson at FreeBSD.org
Fri Aug 11 10:07:48 UTC 2006
On Wed, 9 Aug 2006, Dima Roshin wrote:
> Greeting colleagues. I've got two DL-360(pciX bus) servers, with BCM5704
> NetXtreme Dual Gigabit Adapters(bge). The Uname is 6.1-RELEASE-p3. The bge
> interfaces of the both servers are connected with each other with a cat6
> patchcord.
On any recent box, or even somewhat older ones, achieving gigabit speeds with
decent frame sizes (~1500 or greater) should be trivial. Using two dual-Xeons
in the netperf cluster (of similar configuration), without any
optimization/configuration at all (not even default TCP buffer size changes),
and likely with at least some debugging compiled in, I got 930 Mbps on the
netperf TCP stream test.
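For a sense of how close 930 Mbps is to the ceiling, here is a rough back-of-the-envelope sketch. It assumes standard Ethernet framing overhead, 20-byte IPv4 and TCP headers, and optionally the 12-byte TCP timestamp option; the function and constant names are made up for illustration and the result is an upper bound, not a measurement.

```python
# Rough ceiling for TCP payload throughput on gigabit Ethernet.
# Assumptions: IPv4, no IP options, one full-size segment per frame.
LINE_RATE_MBPS = 1000.0
MTU = 1500
ETH_OVERHEAD = 7 + 1 + 14 + 4 + 12  # preamble, SFD, header, FCS, inter-frame gap
IP_HDR = 20
TCP_HDR = 20


def tcp_goodput_mbps(mtu=MTU, tcp_options=0):
    """Payload bits delivered per second when the wire is kept full."""
    payload = mtu - IP_HDR - TCP_HDR - tcp_options
    wire_bytes = mtu + ETH_OVERHEAD
    return LINE_RATE_MBPS * payload / wire_bytes


print(round(tcp_goodput_mbps(), 1))                # 949.3, no TCP options
print(round(tcp_goodput_mbps(tcp_options=12), 1))  # 941.5, with timestamps
```

So an untuned 930 Mbps is already within a couple of percent of what the wire can carry at a 1500-byte MTU.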
So if you're not getting that speed, you need to look at the configuration
closely. In particular, I would...
(0) Make sure expensive debugging features, such as WITNESS and INVARIANTS,
are disabled. Make sure netperf is not compiled with -DHISTOGRAM, which
is not the default (anymore).
(1) Confirm that your cabling is all good, and probably replace the cable to
be sure. Remove the switch from the loop to make sure it's not a switch
problem.
(2) Disable polling. One problem I've observed with polling is that you must
poll at a very high rate on high speed links, or the polling rate is too
low for the buffer on the ethernet card, so packets are dropped because
it's drained too infrequently. With interrupt moderation on modern cards,
you get significant polling-like effects anyway, and to be honest,
sinking or sourcing a gigabit on a decent box should work fine without
special stack optimizations. I don't remember the numbers, but you may
find that to make polling reliable, you need to further increase HZ.
(3) Check for interrupt problems. Make sure that the receive interrupt
isn't firing significantly faster than desired, that there's no interrupt
shadowing to other interrupt handlers, etc. Also confirm that all PCI
segments used for gigabit networking are 64-bit, not 32-bit. I believe
PCI-X should be fine.
(4) Use top -S and systat -vmstat 1 to characterize the system load during a
test run (ideally start it running when the test starts, and capture the
output 60 seconds in or so), as this is valuable debugging information
that will help us decide what the source of the problem is.
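To put rough numbers on the polling concern in (2), the sketch below estimates how many full-size frames arrive between polls at a given HZ, and the lowest polling rate that keeps a receive ring from overflowing. The ring size of 512 slots is an assumption for illustration only (check your driver), the helper names are hypothetical, and the model ignores any per-poll burst limits the polling code applies.

```python
import math

LINE_RATE_BPS = 1_000_000_000
FULL_FRAME = 1538     # 1500-byte MTU plus 38 bytes of framing, preamble and gap
RX_RING_SLOTS = 512   # ASSUMED receive ring size, not taken from any driver


def frames_per_second(wire_bytes):
    """Back-to-back frames per second at gigabit line rate."""
    return LINE_RATE_BPS / (wire_bytes * 8)


def frames_per_poll(wire_bytes, hz):
    """Frames that accumulate on the card between two polls."""
    return frames_per_second(wire_bytes) / hz


def min_poll_hz(wire_bytes, ring_slots):
    """Lowest polling rate that drains the ring before it fills."""
    return math.ceil(frames_per_second(wire_bytes) / ring_slots)


print(round(frames_per_poll(FULL_FRAME, hz=100)))   # 813
print(round(frames_per_poll(FULL_FRAME, hz=1000)))  # 81
print(min_poll_hz(FULL_FRAME, RX_RING_SLOTS))       # 159
```

Under these assumptions HZ=100 lets more frames pile up per tick than the ring can hold, which is the kind of drop described above; smaller frames push the required rate up further.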
Where gigabit gets tricky is with small frame sizes, where the per-packet cost
dominates -- i.e., 0-byte UDP payloads. At large frame sizes, raw TCP
shouldn't be a problem.
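The per-packet cost argument can be illustrated with a little arithmetic. This sketch assumes UDP over IPv4 with no options and standard Ethernet framing (64-byte minimum frame, 8-byte preamble, 12-byte inter-frame gap); the names are hypothetical and the figures are line-rate limits, not measurements from any particular box.

```python
LINE_RATE_BPS = 1_000_000_000
HEADERS = 14 + 20 + 8      # Ethernet + IPv4 + UDP headers
FCS = 4
MIN_FRAME = 64             # minimum Ethernet frame, including FCS
PREAMBLE_AND_GAP = 8 + 12


def wire_bytes(udp_payload):
    """Bytes of wire time one datagram consumes, with padding to the minimum."""
    frame = max(MIN_FRAME, udp_payload + HEADERS + FCS)
    return frame + PREAMBLE_AND_GAP


def packets_per_second(udp_payload):
    return LINE_RATE_BPS / (wire_bytes(udp_payload) * 8)


def ns_per_packet(udp_payload):
    """Time budget per packet if the host is to keep up with line rate."""
    return 1e9 / packets_per_second(udp_payload)


for payload in (0, 1472):
    print(payload, round(packets_per_second(payload)), round(ns_per_packet(payload)))
# 0 1488095 672
# 1472 81274 12304
```

With 0-byte payloads the host has roughly 672 ns to handle each packet, against about 12 microseconds for full-size frames, which is why small frames are where the stack's per-packet overhead shows.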
Robert N M Watson
Computer Laboratory
University of Cambridge