New cpu_switch() and cpu_throw().
Jeff Roberson
jroberson at chesapeake.net
Tue Jun 5 05:33:04 UTC 2007
Every architecture needs new features in cpu_switch() and cpu_throw()
before it can support per-cpu schedlock. I'll describe those below. I'm
soliciting help or advice in implementing these on platforms other than
x86 and amd64, especially on ia64, where things are implemented in C!
I checked in the new version of cpu_switch() for amd64 today after
threadlock went in. Basically, we have to release the old thread's lock
when it's switched out and acquire the new thread's lock when it's
switched in.
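In C terms (ia64 is already in C, and other ports may want to follow) the
new interface is just an extra lock argument; a sketch of the prototype,
with the 'mtx' argument described below, is:

        /*
         * Sketch only: the third argument is the lock that the old
         * thread gets handed off to once we are done with its stack.
         */
        void    cpu_switch(struct thread *oldtd, struct thread *newtd,
                    struct mtx *mtx);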
The release must happen after we're totally done with the stack and
vmspace of the thread being switched out. On amd64 this meant waiting
until after we clear the active bits used for TLB shootdown. The release
actually makes use of a new 'mtx' argument to cpu_switch() and sets the
td_lock pointer to this argument rather than unlocking a real lock.
td_lock will have already been set to the blocked lock, which is always
blocked (it can never be acquired). Threads spinning in thread_lock()
will notice the td_lock pointer change and acquire the new lock. So the
release is simple: just a non-atomic store of a pointer passed as an
argument. On amd64:
        movq    %rdx, TD_LOCK(%rdi)     /* Release the old thread */
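For a C implementation the release is nothing more than a plain pointer
store; roughly (a sketch, not taken from the patch):

        /* Hand the old thread off to the lock passed in as 'mtx'. */
        oldtd->td_lock = mtx;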
The acquire part is slightly more complicated and involves a little loop.
We don't actually have to spin trying to lock the thread; we just spin
until its td_lock is no longer set to the blocked lock. The switching
thread already owns the per-cpu scheduler lock for the current cpu. If
we're switching into a thread whose td_lock is still set to blocked_lock,
another cpu is about to set it to our current cpu's lock via the mtx
argument mentioned above. On amd64 we have:
        /* Wait for the new thread to become unblocked */
        movq    $blocked_lock, %rdx
1:
        movq    TD_LOCK(%rsi),%rcx
        cmpq    %rcx, %rdx
        je      1b
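The C equivalent is a simple spin on the td_lock pointer, e.g. (a sketch,
not from the patch, using the cpu_spinwait() pause hint):

        /* Wait until the cpu switching 'newtd' out hands it a real lock. */
        while (newtd->td_lock == &blocked_lock)
                cpu_spinwait();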
So these two are actually quite simple. You can see the full patch for
cpu_switch.S as the first file modified in:
http://people.freebsd.org/~jeff/threadlock.diff
For cpu_throw() we have to actually complete a real unlock of a spinlock.
What happens here, although this isn't in CVS yet, is that thread_exit()
will set the exiting thread's lock pointer to the per-process spinlock.
Holding this spinlock prevents wait() from reclaiming the process's
resources while the thread is still executing, so cpu_throw() must do the
final unlock once it is finished with the thread. This code on amd64 is
(from memory rather than a patch):
        movq    $MTX_UNOWNED, %rdx
        movq    TD_LOCK(%rsi), %rsi
        xchgq   %rdx, MTX_LOCK(%rsi)
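In C this is just the usual release of whatever spinlock the exiting
thread's td_lock points at; roughly (a sketch, assuming the
atomic_store_rel_ptr() style of release, where the asm above gets its
barrier from xchg instead):

        /*
         * 'td' is the exiting thread; thread_exit() pointed its td_lock
         * at the per-process spinlock.
         */
        struct mtx *m = td->td_lock;

        atomic_store_rel_ptr(&m->mtx_lock, MTX_UNOWNED);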
I'm hoping to have at least the cpu_throw() part done for every
architecture for 7.0. This will enable me to simplify thread_exit() and
not have a lot of per-scheduler/per-architecture workarounds. Without the
cpu_switch() parts, sched_4bsd will still work on an architecture.
I have a per-cpu spinlock version of ULE which may replace ULE or exist
alongside it as sched_smp. This will only work on architectures that
implement the new cpu_throw() and cpu_switch().
Consider this an official call for help with the architectures you
maintain. Please also let me know if you maintain an arch that you don't
mind having temporarily broken until you implement this.
Thanks,
Jeff