what to replace splnet in FreeBSD 5.x?
Robert Watson
rwatson at FreeBSD.org
Tue Jul 12 15:30:19 GMT 2005
On Tue, 12 Jul 2005, Ed Maste wrote:
> On Sat, Jul 09, 2005 at 10:18:19AM -0700, Sam Leffler wrote:
>> spl's lock execution threads. 5.x and later systems mostly lock data
>> structures using mtx's (there are a very few exceptions). Thus there
>> isn't necessarily a direct replacement, you usually need to rethink your
>> locking/synchronization strategy.
>
> This brings up the issue of the remaining splnet()s in 5.x and -CURRENT.
> Grepping for "= splnet" in net/ and netinet/ shows more than 50 now
> no-op splnet()s left in the stack.
>
> We've run into corruption in the multicast address lists (in_multihead)
> on 5.x, and it turns out in_addmulti still has splnet() "protecting" the
> list.
>
> I'm not sure how many of the splnet()s are actually false positives
> (i.e. no longer relevant, locked in another way, etc.) but they're
> probably all good indicators of places that locking still needs to be
> revisited.
In many cases, the splnet()s have been left in as reference indicators of
earlier synchronization requirements and strategies. In some places, they
are signs of code still running with Giant over it (e.g., KAME IPSEC,
I4B). There are a number of areas of weakness in the current locking
work, including:
- Several areas of the network stack that still require Giant to operate
correctly. Examples are KAME IPSEC (not FAST_IPSEC), some interactions
between the tty and network code (such as SLIP), portions of the ATM
stack, and some of the edge case hardware drivers (e.g., older ISA
ethernet cards). When these components are present, some or all of the
network stack will run with Giant over it.
- Several areas where inadequate synchronization is present. Typically
they are associated with hard to exploit races, such as unicast address
configuration, and therefore generally don't result in instability (and
in most cases, we've actually done significant stability testing to make
sure they don't). Almost always, these races are around
administratively modified data structures.
- Several cases where undesirable synchronization is present, i.e.,
locking that adds more overhead than we'd like, doesn't match well with
the data structures and data management strategies, or doesn't interact
well with the layering in the network stack.
There is active work in all of these areas to remedy the problems. Some
are substantially better off in 6.x than 5.x; others will require
additional work.
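To make the spl-versus-mutex distinction concrete, here is a rough
userland sketch of locking a data structure rather than an execution
context. It is only an analogue: pthread mutexes stand in for kernel
mutexes, and every name in it (mc_entry, mc_list, mc_add, mc_demo) is
hypothetical and not taken from the actual in_multihead code.

```c
/* Userland analogue only: a pthread mutex stands in for a kernel mutex. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical list entry; not the real struct in_multi. */
struct mc_entry {
    unsigned int     addr;
    struct mc_entry *next;
};

/* The lock lives next to the data it protects. */
struct mc_list {
    pthread_mutex_t  lock;
    struct mc_entry *head;
    int              count;
};

void mc_list_init(struct mc_list *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->head = NULL;
    l->count = 0;
}

/* Insert at the head of the list; returns 0 on success, -1 if malloc fails. */
int mc_add(struct mc_list *l, unsigned int addr)
{
    struct mc_entry *e = malloc(sizeof(*e));

    if (e == NULL)
        return -1;
    e->addr = addr;

    pthread_mutex_lock(&l->lock);   /* where a no-op splnet() used to sit */
    e->next = l->head;
    l->head = e;
    l->count++;
    pthread_mutex_unlock(&l->lock); /* where splx() used to sit */
    return 0;
}

struct worker_arg {
    struct mc_list *list;
    int             n;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;

    for (int i = 0; i < a->n; i++)
        mc_add(a->list, (unsigned int)i);
    return NULL;
}

/* Run nthreads concurrent adders, then walk the list and return its length. */
int mc_demo(int nthreads, int per_thread)
{
    struct mc_list list;
    struct worker_arg arg;
    pthread_t *tids = malloc((size_t)nthreads * sizeof(*tids));
    int walked = 0;

    assert(tids != NULL);
    mc_list_init(&list);
    arg.list = &list;
    arg.n = per_thread;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    /* All writers have exited, so the walk needs no lock. */
    while (list.head != NULL) {
        struct mc_entry *e = list.head;
        list.head = e->next;
        free(e);
        walked++;
    }
    pthread_mutex_destroy(&list.lock);
    free(tids);
    return walked;
}
```

With the lock held around the list update, mc_demo(4, 1000) should
return 4000 provided no allocation fails. Removing the lock/unlock pair
leaves the same kind of unprotected insert that a no-op splnet() leaves
behind, and concurrent adders can then lose or corrupt entries.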
I'm concerned about the multicast address list problems you've been
experiencing, but haven't yet had a chance to investigate. If you could
provide a code fragment that exercises this problem, that would probably
get me started a lot more quickly.
Thanks,
Robert N M Watson