HEADS UP: inpcb/inpcbinfo rwlocking: coming to a 7-STABLE branch near you
Mike Tancsa
mike at sentex.net
Mon Aug 18 13:37:55 UTC 2008
At 04:14 AM 8/18/2008, Robert Watson wrote:
>On Sun, 3 Aug 2008, Robert Watson wrote:
>
>>This is an advance warning that, late next week, I will be merging
>>a fairly large set of changes to the IPv4 and IPv6 protocols
>>layered over the inpcb/inpcbinfo kernel infrastructure. To be
>>specific, this affects TCP, UDP, and raw sockets on both IPv4 and
>>IPv6. I will post a further e-mail announcement along with patch
>>set and schedule in a day or two once it's prepared.
>
>FYI: This patch has now been committed to Subversion. I'll keep a
>close eye out for difficulties; if you run into issues, please send
>me an e-mail (and CC stable@).
Hi Robert,
I just did a buildworld/kernel in case your commit fixed the
routing bugs, but I am still seeing those bogus arp / routing table
entries. I narrowed it down to the commits between the two date=
tags below. I don't think it's the Intel stuff, as another user
reported the same issue using bce NICs.
date=2008.07.30.18.00.00
and
date=2008.07.31.00.00.00
Updating collection src-all/cvs
Edit src/sys/conf/files
Add delta 1.1243.2.32 2008.07.30.20.35.41 kmacy
Checkout src/sys/dev/e1000/LICENSE
Checkout src/sys/dev/e1000/README
Checkout src/sys/dev/e1000/e1000_80003es2lan.c
Checkout src/sys/dev/e1000/e1000_80003es2lan.h
Checkout src/sys/dev/e1000/e1000_82540.c
Checkout src/sys/dev/e1000/e1000_82541.c
Checkout src/sys/dev/e1000/e1000_82541.h
Checkout src/sys/dev/e1000/e1000_82542.c
Checkout src/sys/dev/e1000/e1000_82543.c
Checkout src/sys/dev/e1000/e1000_82543.h
Checkout src/sys/dev/e1000/e1000_82571.c
Checkout src/sys/dev/e1000/e1000_82571.h
Checkout src/sys/dev/e1000/e1000_82575.c
Checkout src/sys/dev/e1000/e1000_82575.h
Checkout src/sys/dev/e1000/e1000_api.c
Checkout src/sys/dev/e1000/e1000_api.h
Checkout src/sys/dev/e1000/e1000_defines.h
Checkout src/sys/dev/e1000/e1000_hw.h
Checkout src/sys/dev/e1000/e1000_ich8lan.c
Checkout src/sys/dev/e1000/e1000_ich8lan.h
Checkout src/sys/dev/e1000/e1000_mac.c
Checkout src/sys/dev/e1000/e1000_mac.h
Checkout src/sys/dev/e1000/e1000_manage.c
Checkout src/sys/dev/e1000/e1000_manage.h
Checkout src/sys/dev/e1000/e1000_nvm.c
Checkout src/sys/dev/e1000/e1000_nvm.h
Checkout src/sys/dev/e1000/e1000_osdep.c
Checkout src/sys/dev/e1000/e1000_osdep.h
Checkout src/sys/dev/e1000/e1000_phy.c
Checkout src/sys/dev/e1000/e1000_phy.h
Checkout src/sys/dev/e1000/e1000_regs.h
Checkout src/sys/dev/e1000/if_em.c
Checkout src/sys/dev/e1000/if_em.h
Checkout src/sys/dev/e1000/if_igb.h
Edit src/sys/kern/kern_synch.c
Add delta 1.302.2.3 2008.07.30.18.28.09 rwatson
Edit src/sys/kern/sys_process.c
Add delta 1.145.2.1 2008.07.30.19.49.10 jhb
Edit src/sys/netinet/tcp_subr.c
Add delta 1.300.2.4 2008.07.30.20.35.41 kmacy
Edit src/sys/netinet/tcp_syncache.c
Add delta 1.130.2.9 2008.07.30.20.35.41 kmacy
Add delta 1.130.2.10 2008.07.30.20.51.20 kmacy
Edit src/sys/netinet/tcp_syncache.h
Add delta 1.1.2.1 2008.07.30.20.35.41 kmacy
Edit src/sys/netinet/tcp_usrreq.c
Add delta 1.163.2.4 2008.07.30.20.35.41 kmacy
Edit src/sys/netinet/udp_usrreq.c
Add delta 1.218.2.1 2008.07.30.21.23.21 bz
Edit src/sys/netinet6/ip6_input.c
Add delta 1.95.2.1 2008.07.30.21.23.21 bz
Edit src/sys/netinet6/ip6_var.h
Add delta 1.39.2.2 2008.07.30.21.23.21 bz
Edit src/sys/sys/socket.h
Add delta 1.95.2.3 2008.07.30.19.35.40 kmacy
Edit src/sys/ufs/ufs/ufs_lookup.c
Add delta 1.83.2.2 2008.07.30.21.43.42 jhb
Edit src/sys/vm/vm_object.c
Add delta 1.385.2.2 2008.07.30.21.43.42 jhb
Edit src/sys/vm/vm_object.h
Add delta 1.114.2.1 2008.07.30.21.43.42 jhb
Edit src/sys/vm/vnode_pager.c
Add delta 1.236.2.2 2008.07.30.21.43.42 jhb
---Mike
>Thanks,
>
>Robert N M Watson
>Computer Laboratory
>University of Cambridge
>
>>
>>The thrust of this change is to replace the mutexes protecting the
>>inpcb and inpcbinfo data structures with read-write locks
>>(rwlocks). These structures represent, respectively, particular
>>sockets and the global socket lists for all socket types in IPv4
>>and IPv6 except for SCTP. When you run netstat, inpcbinfo is the
>>data structure referencing all connections, and each line in the
>>netstat output reflects the contents of a specific inpcb.
>>
>>In the current stage of this work, the intent is to improve
>>performance for datagram-related protocols on SMP systems by
>>allowing concurrent acquisition of both global and connection locks
>>during receive and transmit. This is possible because, in the
>>common case, no connection or global state is modified during
>>UDP/raw receive and transmit at the IP layer, so a read lock is
>>sufficient to prevent data in those structures from unexpectedly
>>changing. For receive, socket layer state is modified, but this is
>>separately protected by socket layer locks. On transmit, no state
>>is modified at any layer, so in principle we will allow fully
>>parallel transmit from multiple threads down to about the routing
>>and network interface layers, whereas previously they would
>>bottleneck in UDP.
>>
>>The applications targeted by this change are threaded UDP server
>>applications, such as BIND9, nsd, and UDP-based memcached. Kris
>>Kennaway and Paul Saab have done fairly extensive testing with the
>>changes and demonstrated significant performance improvements due
>>to reduced contention and overhead. Perhaps they can mention some
>>of those numbers in a follow-up to this post.
>>
>>The reason for the heads up is that, while carefully tested,
>>changes of this sort do come with risks. We've carefully
>>structured them so as to avoid breaking the ABIs for netstat, etc,
>>but it's not impossible that some problems will arise as the
>>changes settle. The goal, however, is to see these performance
>>improvements in 7.1, and since they've had a bit to shake out in
>>8.x and seen some heavy use, I think now is the right time to merge them.
>>
>>In any case, I will send out e-mail in a couple of days with a
>>proposed merge patch and schedule for merging, and perhaps if you
>>are in a position where you might benefit from these
>>improvements, or have interesting UDP or raw-socket based
>>applications running on 7.x, you could test the candidate patch
>>before it's merged, reporting any problems. Unless I receive
>>negative feedback, I will plan on merging the changes late in the
>>week, and keep a close eye on stable@ for any reports of problems.
>>
>>Thanks,
>>
>>Robert N M Watson
>>Computer Laboratory
>>University of Cambridge
>>_______________________________________________
>>freebsd-stable at freebsd.org mailing list
>>http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"