FreeBSD IP Forwarding performance (question, and some info)
[7-stable, current, em, smp]
Robert Watson
rwatson at FreeBSD.org
Mon Jul 7 13:39:00 UTC 2008
On Mon, 7 Jul 2008, Bruce Evans wrote:
>> (1) sendto() to a specific address and port on a socket that has been
>> bound to INADDR_ANY and a specific port.
>>
>> (2) sendto() to a specific address and port on a socket that has been
>> bound to a specific IP address (not INADDR_ANY) and a specific port.
>>
>> (3) send() on a socket that has been connect()'d to a specific IP
>> address and a specific port, and bound to INADDR_ANY and a specific
>> port.
>>
>> (4) send() on a socket that has been connect()'d to a specific IP
>> address and a specific port, and bound to a specific IP address (not
>> INADDR_ANY) and a specific port.
>>
>> The last of these should really be quite a bit faster than the first of
>> these, but I'd be interested in seeing specific measurements for each if
>> that's possible!
>
> Not sure if I understand networking well enough to set these up quickly.
> Does netrate use one of (3) or (4) now?
(3) and (4) are effectively the same thing, I think, since connect(2) should
force the selection of a source IP address, but it's not a bad idea to confirm
that. :-)
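One quick way to confirm it is to call getsockname(2) after connect(2) and
check that a specific source address has been filled in; a minimal sketch,
assuming a UDP socket s that has just been connect(2)'d (variable names here
are just for illustration):

	struct sockaddr_in local;
	socklen_t len;

	len = sizeof(local);
	/* The kernel reports the source address it selected at connect(2) time. */
	if (getsockname(s, (struct sockaddr *)&local, &len) < 0)
		err(-1, "getsockname");
	printf("selected source address: %s\n", inet_ntoa(local.sin_addr));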
The structure of the desired micro-benchmark here is basically:
int
main(int argc, char *argv[])
{
	struct sockaddr_in sin;
	int s;

	/* Parse command line arguments such as addresses and ports. */
	s = socket(PF_INET, SOCK_DGRAM, 0);
	if (s < 0)
		err(-1, "socket");
	if (bind_desired) {
		/* Set up the local sockaddr_in. */
		if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
			err(-1, "bind");
	}
	/* Set up the destination sockaddr_in. */
	if (connect_desired) {
		if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
			err(-1, "connect");
	}
	while (appropriate_condition) {
		if (connect_desired) {
			if (send(s, buf, buflen, 0) < 0)
				errors++;
		} else {
			if (sendto(s, buf, buflen, 0,
			    (struct sockaddr *)&sin, sizeof(sin)) < 0)
				errors++;
		}
	}
}
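The four cases above then fall out of the flags: whether the local binding
uses a specific address or INADDR_ANY, and whether connect_desired is set so
that the loop uses send(2) rather than sendto(2).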
> I can tell you vaguely about old results for netrate (send()) vs ttcp
> (sendto()). send() is lighter weight of course, and this made a difference
> of 10-20%, but after further tuning the difference became smaller, which
> suggests that everything ends up waiting for something in common.
>
> Now I can measure cache misses better and hope that a simple count of cache
> misses will be a more reproducible indicator of significant bottlenecks than
> pps. I got nowhere trying to reduce instruction counts, possibly because it
> would take avoiding 100's of instructions to get the same benefit as
> avoiding a single cache miss.
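For what it's worth, hwpmc(4) and pmcstat(8) make that sort of measurement
fairly easy to explore; something along these lines counts data cache misses
across a run of the sender (the dc-misses event alias and the netsend
arguments are illustrative, so adjust for your CPU and setup):

	pmcstat -p dc-misses ./netsend 10.0.0.2 5000 ...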
If you look at the design of higher performance UDP applications, they
generally bind a specific IP address (perhaps every IP address on the host,
each with its own socket), and if they do sustained communication with a
specific endpoint they use connect(2) rather than passing an address to the
kernel on every sendto(2) call, as sketched below.
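A minimal sketch of the one-socket-per-address part of that pattern; the
local_addrs array, naddrs count, and MAXADDRS limit are hypothetical,
standing in for however the application discovers its addresses:

	struct sockaddr_in sin;
	int i, s[MAXADDRS];

	for (i = 0; i < naddrs; i++) {
		s[i] = socket(PF_INET, SOCK_DGRAM, 0);
		if (s[i] < 0)
			err(-1, "socket");
		memset(&sin, 0, sizeof(sin));
		sin.sin_len = sizeof(sin);
		sin.sin_family = AF_INET;
		sin.sin_port = htons(port);
		/* Bind a specific local IP, not INADDR_ANY. */
		sin.sin_addr = local_addrs[i];
		if (bind(s[i], (struct sockaddr *)&sin, sizeof(sin)) < 0)
			err(-1, "bind");
	}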
udp_output() makes the trade-offs there fairly clear: with the most recent
rev, the optimal case is one where connect(2) has been called, allowing a
single inpcb read lock and no global data structure access, vs. an application
calling sendto(2) for every packet with the local binding remaining
INADDR_ANY. Middle-ground applications, such as named(8), will force a local
binding using bind(2), but then still have to pass an address to each
sendto(2). In the future, this case will be further optimized in our code by
using a global read lock rather than a global write lock: we have to check for
collisions, but we don't actually have to reserve the new 4-tuple for the UDP
socket, as it's an ephemeral association rather than a connect(2).
Robert N M Watson
Computer Laboratory
University of Cambridge