Scalability problem from route refcounting
Kris Kennaway
kris at obsecurity.org
Thu Mar 15 01:15:12 UTC 2007
I have recently started looking at database performance over gigabit
ethernet, and there seems to be a bottleneck coming from the way route
reference counting is implemented. On an 8-core system it looks like
we spend a lot of time waiting for the rtentry mutex:
 max     total  wait_total    count  avg  wait_avg  cnt_hold  cnt_lock  name
[...]
 408    950496     1135994   301418    3         3     24876     55936  net/if_ethersubr.c:397 (sleep mutex:bge1)
 974    968617     1515169   253772    3         5     14741     60581  dev/bge/if_bge.c:2949 (sleep mutex:bge1)
2415  18255976     1607511   253841   71         6    125174      3131  netinet/tcp_input.c:770 (sleep mutex:inp)
 233   1850252     2080506   141817   13        14         0    126897  netinet/tcp_usrreq.c:756 (sleep mutex:inp)
 384   6895050     2737492   299002   23         9     92100     73942  dev/bge/if_bge.c:3506 (sleep mutex:bge1)
 626   5342286     2760193   301477   17         9     47616     54158  net/route.c:147 (sleep mutex:radix node head)
 326   3562050     3381510   301477   11        11    133968    110104  net/route.c:197 (sleep mutex:rtentry)
 146    947173     5173813   301477    3        17     44578    120961  net/route.c:1290 (sleep mutex:rtentry)
 146    953718     5501119   301476    3        18     63285    121819  netinet/ip_output.c:610 (sleep mutex:rtentry)
  50   4530645     7885304  1423098    3         5    642391    788230  kern/subr_turnstile.c:489 (spin mutex:turnstile chain)
i.e. during a 30-second sample we spend a total of more than 14 seconds
(summed over all CPUs) waiting to acquire the rtentry mutex.
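
For anyone who wants to check that figure, here is a rough sketch of the
arithmetic in Python. It assumes the wait_total column is in
microseconds, which is consistent with the 14-second figure but is not
stated in the output itself. The variable names are made up for
illustration.

```python
# wait_total values copied from the three "sleep mutex:rtentry" rows
# of the profile. Unit assumed to be microseconds (an assumption; the
# profiling output does not label its units).
rtentry_wait_total_us = {
    "net/route.c:197": 3381510,
    "net/route.c:1290": 5173813,
    "netinet/ip_output.c:610": 5501119,
}

# Sum the time spent waiting across all three acquisition points.
total_us = sum(rtentry_wait_total_us.values())
total_seconds = total_us / 1_000_000

print(f"rtentry wait_total: {total_us} us = {total_seconds:.2f} s")
```

Under that assumption the three rows add up to a little over 14 seconds
of waiting, accumulated across all CPUs during the sample.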
This appears to be because, among other things, we increment and then
decrement the route refcount for each packet we send, and each of those
operations requires acquiring the rtentry mutex for that route before
adjusting the refcount. So multiplexing traffic for lots of connections
over a single route is being partly rate-limited by those mutex
operations.
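
To make the pattern concrete, here is a toy model in Python. It is not
the kernel code: Route, send_packet and simulate are hypothetical names
invented for this sketch, and a threading.Lock stands in for the rtentry
mutex. It only illustrates that every packet costs two acquisitions of
the same per-route lock, however many senders share the route.

```python
import threading


class Route:
    """Hypothetical stand-in for a route entry with a locked refcount."""

    def __init__(self):
        self.lock = threading.Lock()   # plays the role of the rtentry mutex
        self.refcnt = 0
        self.lock_acquisitions = 0     # bookkeeping for this sketch only

    def ref(self):
        # Take the per-route lock just to bump the reference count.
        with self.lock:
            self.lock_acquisitions += 1
            self.refcnt += 1

    def unref(self):
        # Take the same lock again to drop the reference.
        with self.lock:
            self.lock_acquisitions += 1
            self.refcnt -= 1


def send_packet(route):
    route.ref()
    # ... the packet would be handed to the driver here ...
    route.unref()


def simulate(n_threads, packets_per_thread):
    """Send packets from several threads over one shared route."""
    route = Route()

    def worker():
        for _ in range(packets_per_thread):
            send_packet(route)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return route


if __name__ == "__main__":
    r = simulate(n_threads=8, packets_per_thread=1000)
    print(r.refcnt, r.lock_acquisitions)
```

With 8 senders and 1000 packets each, the single lock is taken 16000
times, all serialized through one route. The model says nothing about
how long each acquisition waits; it only shows where the contention
point sits.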
This is not the end of the story, though: the bge driver is a serious
bottleneck on its own. For example, I nulled out the route locking,
since it is not relevant in my environment (at least for the purposes
of this test), and that exposed bge as the next problem. Other drivers
may not be so bad.
Kris