a proposed callout API
Robert Watson
rwatson at FreeBSD.org
Thu Nov 30 17:39:31 PST 2006
On Thu, 30 Nov 2006, Ivan Voras wrote:
> Not trying to take sides here, but for us willing to learn here, what exactly
> are the problems in Matt Dillon's suggestions? From a novice's POV, having
> per-cpu queues looks (emphasis: looks) very scalable and performant.
The implications of adopting the model Matt proposes are quite far-reaching:
callouts don't exist in isolation, but occur in the context of data structures
and work occurring in many threads. If callouts are pinned to a particular
CPU, and can only be scheduled, rescheduled, and cancelled from that CPU, that
implies either that all work associated with that callout is also pinned to
the CPU, or that migration or message-passing be involved if the requirement
comes up in a thread on another CPU.
Consider the case of TCP timers: a number of TCP timers get regularly
rescheduled (delack, retransmit, etc). If they can only be manipulated from
cpu0 (i.e., protected by a synchronization primitive that can't be acquired
from another CPU -- i.e., critical sections instead of mutexes), how do you
handle the case where a TCP packet for that connection is processed on
cpu1 and needs to change the scheduling of the timer? In a strict work/data
structure pinning model, you would pin the TCP connection to cpu0, and only
process any data leading to timer changes on that CPU. Alternatively, you
might pass a message from cpu1 to cpu0 to change the scheduling.
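
To make the message-passing alternative concrete, here is a minimal
single-threaded sketch in C. It is not the FreeBSD callout API: every name in
it (cpu_timers, resched_msg, post_resched, drain_mbox) is hypothetical, and
the mailbox is a plain array where a real kernel would need an IPI or a
lock-free queue. The point is only that code running on cpu1 never touches
the timer itself; the change takes effect when cpu0 next drains its mailbox.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU     2
#define NTIMERS  4
#define MBOX_LEN 8

/* Hypothetical request sent to the CPU that owns a timer. */
struct resched_msg {
    int timer_id;   /* index into the owner's timer table */
    int expire;     /* new absolute expiry tick */
};

/* Per-CPU state: only the owning CPU ever writes expire[]. */
struct cpu_timers {
    int expire[NTIMERS];               /* 0 means "not scheduled" */
    struct resched_msg mbox[MBOX_LEN]; /* requests posted by other CPUs */
    size_t mbox_count;
};

static struct cpu_timers cpus[NCPU];

/*
 * Called on any CPU: queue a request instead of touching the timer.
 * Returns 0 on success, -1 if the id is invalid or the mailbox is full.
 * (Assumption: a real mailbox would itself need synchronization.)
 */
static int
post_resched(int owner, int timer_id, int expire)
{
    struct cpu_timers *ct = &cpus[owner];

    if (timer_id < 0 || timer_id >= NTIMERS || ct->mbox_count == MBOX_LEN)
        return (-1);
    ct->mbox[ct->mbox_count].timer_id = timer_id;
    ct->mbox[ct->mbox_count].expire = expire;
    ct->mbox_count++;
    return (0);
}

/*
 * Called only on the owning CPU: apply queued requests in arrival order.
 * Returns the number of requests applied.
 */
static int
drain_mbox(int owner)
{
    struct cpu_timers *ct = &cpus[owner];
    int applied = 0;

    for (size_t i = 0; i < ct->mbox_count; i++) {
        const struct resched_msg *m = &ct->mbox[i];

        assert(m->timer_id >= 0 && m->timer_id < NTIMERS);
        ct->expire[m->timer_id] = m->expire;
        applied++;
    }
    ct->mbox_count = 0;
    return (applied);
}
```

Note the cost this makes visible: after post_resched(0, 1, 100) the timer is
still unchanged, and stays so until drain_mbox(0) runs on the owning CPU.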
The idea of processing timers in multiple threads and pinning them to multiple
CPUs clearly isn't a bad idea: we could likely benefit from parallelism (and
generally, concurrency) in timer processing. One of the things we discussed
at the recent developer summit was subsystem callout threads (introducing the
opportunity for parallelism without committing to a particular CPU scheduling
model), as well as per-CPU callout threads but protected using mutexes so that
reschedule/cancel/etc can still be performed from other CPUs. Changing the
API so that scheduling/rescheduling/etc activities themselves must occur on a
particular CPU has serious implications and commits us to an architectural
approach for which there is little consensus. If the goal is simply
parallelism, it's possible to accomplish that without embedding assumptions
about the synchronization model at this point. Take a look at the USENIX
paper by Paul Willmann (et al) at Rice for some rather interesting
experimentation, measurement, and discussion precisely along these lines:
http://www.ece.rice.edu/~willmann/pubs/paranet_tr06-872.pdf
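
As an illustration of the "per-CPU callout threads but protected using
mutexes" option, here is a rough userspace sketch in C. It is an assumption
about what such a design could look like, not the code discussed at the
developer summit: the names (pcpu_callouts, pcq_reschedule, pcq_cancel,
pcq_run) are hypothetical, and a C11 atomic_flag spin lock stands in for a
kernel mutex. Each CPU has its own timer table, but because the table is
guarded by a lock rather than a critical section, any thread on any CPU can
reschedule or cancel an entry directly.

```c
#include <assert.h>
#include <stdatomic.h>

#define PCQ_NCPU    2
#define PCQ_NTIMERS 4

/* Hypothetical per-CPU callout table. */
struct pcpu_callouts {
    atomic_flag lock;           /* stand-in for a kernel mutex */
    int expire[PCQ_NTIMERS];    /* 0 means "not scheduled" */
};

static struct pcpu_callouts pcq[PCQ_NCPU];

/* Put every lock in the released state and clear all timers. */
static void
pcq_init(void)
{
    for (int c = 0; c < PCQ_NCPU; c++) {
        atomic_flag_clear(&pcq[c].lock);
        for (int t = 0; t < PCQ_NTIMERS; t++)
            pcq[c].expire[t] = 0;
    }
}

static void
pcq_lock(struct pcpu_callouts *q)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;
}

static void
pcq_unlock(struct pcpu_callouts *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Callable from any CPU. Returns the previous expiry (0 if none). */
static int
pcq_reschedule(int owner, int timer_id, int expire)
{
    struct pcpu_callouts *q = &pcq[owner];
    int prev;

    assert(timer_id >= 0 && timer_id < PCQ_NTIMERS);
    pcq_lock(q);
    prev = q->expire[timer_id];
    q->expire[timer_id] = expire;
    pcq_unlock(q);
    return (prev);
}

/* Callable from any CPU. Returns 1 if a pending timer was cancelled. */
static int
pcq_cancel(int owner, int timer_id)
{
    return (pcq_reschedule(owner, timer_id, 0) != 0);
}

/* Run by the owner's callout thread. Returns the number of timers fired. */
static int
pcq_run(int owner, int now)
{
    struct pcpu_callouts *q = &pcq[owner];
    int fired = 0;

    pcq_lock(q);
    for (int t = 0; t < PCQ_NTIMERS; t++) {
        if (q->expire[t] != 0 && q->expire[t] <= now) {
            q->expire[t] = 0;   /* a real callout would invoke its handler */
            fired++;
        }
    }
    pcq_unlock(q);
    return (fired);
}
```

The trade-off relative to a strictly pinned model is the lock acquisition on
every operation; what it buys is that callers need not know or care which CPU
currently owns the timer.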
Robert N M Watson
Computer Laboratory
University of Cambridge