HEADS UP: inpcb/inpcbinfo rwlocking: coming to a 7-STABLE branch near you
Robert Watson
rwatson at FreeBSD.org
Sat Aug 9 11:30:28 UTC 2008
On Sun, 3 Aug 2008, Robert Watson wrote:
> This is an advance warning that, late next week, I will be merging a fairly
> large set of changes to the IPv4 and IPv6 protocols layered over the
> inpcb/inpcbinfo kernel infrastructure. To be specific, this affects TCP,
> UDP, and raw sockets on both IPv4 and IPv6. I will post a further e-mail
> announcement along with patch set and schedule in a day or two once it's
> prepared.
The patches are at the URL below. They depend on the MFC of rwlock
try-locking, which I did earlier today:
http://www.watson.org/~robert/freebsd/netperf/20080808-7stable-rwlock-inpcb.diff
These include the inpcb/inpcbinfo read/write locking changes (although not yet
for raw/divert sockets). Any testing, especially with heavy UDP loads, would
be much appreciated -- these are fairly complex changes, and also quite a
complex MFC.
Robert N M Watson
Computer Laboratory
University of Cambridge
>
> The thrust of this change is to replace the mutexes protecting the inpcb and
> inpcbinfo data structures with read-write locks (rwlocks). These structures
> represent, respectively, particular sockets and the global socket lists for
> all socket types in IPv4 and IPv6 except for SCTP. When you run netstat,
> inpcbinfo is the data structure referencing all connections, and each line in
> the netstat output reflects the contents of a specific inpcb.
>
> In the current stage of this work, the intent is to improve performance for
> datagram-related protocols on SMP systems by allowing concurrent acquisition
> of both global and connection locks during receive and transmit. This is
> possible because, in the common case, no connection or global state is
> modified during UDP/raw receive and transmit at the IP layer, so a read lock
> is sufficient to prevent data in those structures from unexpectedly changing.
> For receive, socket layer state is modified, but this is separately protected
> by socket layer locks. On transmit, no state is modified at any layer, so in
> principle we will allow fully parallel transmit from multiple threads down to
> about the routing and network interface layers, whereas previously they would
> bottleneck in UDP.
>
> The applications targeted by this change are threaded UDP server
> applications, such as BIND9, nsd, and UDP-based memcached. Kris Kennaway and
> Paul Saab have done fairly extensive testing with the changes and
> demonstrated significant performance improvements due to reduced contention
> and overhead. Perhaps they can mention some of those numbers in a follow-up
> to this post.
>
> The reason for the heads up is that, while carefully tested, changes of this
> sort do come with risks. We've carefully structured them so as to avoid
> breaking the ABIs for netstat, etc., but it's not impossible that some
> problems will arise as the changes settle. The goal, however, is to see
> these performance improvements in 7.1, and since they've had some time to
> shake out in 8.x and seen some heavy use, I think now is the right time to
> merge them.
>
> In any case, I will send out e-mail in a couple of days with a proposed merge
> patch and schedule for merging, and perhaps if you are in a position where
> you might benefit from these improvements, or have interesting UDP or
> raw-socket based applications running on 7.x, you could test the candidate
> patch before it's merged, reporting any problems. Unless I receive negative
> feedback, I will plan on merging the changes late in the week, and keep a
> close eye on stable@ for any reports of problems.
>
> Thanks,
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
>