MONITOR/MWAIT question
Matthew Dillon
dillon at apollo.backplane.com
Sat Jan 8 23:40:02 UTC 2011
:On Sat, Dec 18, 2010 at 02:24:34PM -0800, Matthew Dillon wrote:
:> Does anyone know if an IRET cancels/triggers a MONITOR event?
:
:AMD's Architecture Programmer's Manual explicitly contains:
:
:Events that cause an exit from the monitor event pending state include:
:...
:- Any far control transfer that occurs between the MONITOR and the
:MWAIT.
:
:Joerg
Yah. The Intel documentation listed specific instructions and
said something about a 'far call' but wasn't generic enough. My
AMD manuals are too old, I'm getting a new set. The AMD manual
using the 'any far control transfer' terminology implies that IRET
is also covered.
Another interesting question came up: whether a write on the same
cpu that MONITOR was run on (without a far control transfer) can
trigger a later MWAIT, i.e. MONITOR addr, INCL addr, MWAIT addr on
the same cpu, so that the MWAIT would then effectively be a NOP.
The MONITOR/MWAIT stuff apparently ties into the cpu's cache
management architecture, and a local write to a cache line which
is already exclusive might not count, so I'm not sure if that case
is covered. I can't find a definitive answer so at some point I'll
actually code something up and test it.
It isn't a case that any current use triggers, but I don't like
question marks.
Right now it looks like MONITOR/MWAIT works quite nicely with a
pseudo-FIFO reservation model for handling cpu contention. Basically
you have a windex and a rindex. You reserve a 'spot' using XADD on
the windex and then resolve the cpu<->cpu contention with
MONITOR/CMP/MWAIT's on rindex. Only the owner of the rindex (when
rindex matches the reserved index, which is exactly one cpu out of
the N contending cpus) can increment rindex. That way only *ONE* cpu
at a time is trying to get the spin lock against the current lock
holder instead of all the cpus contending with each other to try to
get the spin lock from the current lock holder.
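
A minimal sketch of the reservation scheme described above, in portable C11 so it can run in userland. Only windex and rindex come from the post; every other name (fifo_lock, wait_for_slot, fifo_lock_demo, ...) is hypothetical. Two assumptions are worth stating loudly: the MONITOR/CMP/MWAIT wait is replaced by a plain polling loop with sched_yield(), since MONITOR/MWAIT are normally not usable outside the kernel, and rindex is advanced right after the underlying spin lock is obtained, which is one plausible reading of the description rather than the confirmed implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical layout; only windex/rindex are named in the post. */
struct fifo_lock {
    atomic_uint windex;   /* next slot to reserve (the XADD target) */
    atomic_uint rindex;   /* slot currently allowed to contend */
    atomic_uint locked;   /* underlying spin lock: 0 free, 1 held */
};

/* Assumption: polling stand-in for the MONITOR/CMP/MWAIT loop on rindex. */
static void wait_for_slot(atomic_uint *rindex, unsigned slot)
{
    while (atomic_load_explicit(rindex, memory_order_acquire) != slot)
        sched_yield();
}

static void fifo_lock_acquire(struct fifo_lock *l)
{
    /* Reserve a spot; atomic_fetch_add compiles to LOCK XADD on x86. */
    unsigned slot = atomic_fetch_add(&l->windex, 1);

    /* Wait until this cpu owns rindex. */
    wait_for_slot(&l->rindex, slot);

    /* Only the rindex owner contends with the current lock holder. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) != 0)
        sched_yield();

    /* Assumption: pass rindex ownership on once the lock is held. */
    atomic_fetch_add_explicit(&l->rindex, 1, memory_order_release);
}

static void fifo_lock_release(struct fifo_lock *l)
{
    assert(atomic_load(&l->locked) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

struct demo_arg {
    struct fifo_lock *lock;
    unsigned long *counter;   /* protected by lock */
    int iters;
};

static void *demo_worker(void *p)
{
    struct demo_arg *a = p;
    for (int i = 0; i < a->iters; i++) {
        fifo_lock_acquire(a->lock);
        (*a->counter)++;
        fifo_lock_release(a->lock);
    }
    return NULL;
}

/* Runs up to 64 threads and returns the final value of the counter. */
unsigned long fifo_lock_demo(int nthreads, int iters)
{
    struct fifo_lock lock;
    unsigned long counter = 0;
    pthread_t tids[64];
    struct demo_arg arg = { &lock, &counter, iters };
    int started = 0;

    atomic_init(&lock.windex, 0);
    atomic_init(&lock.rindex, 0);
    atomic_init(&lock.locked, 0);

    if (nthreads > 64)
        nthreads = 64;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, demo_worker, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

Calling fifo_lock_demo(4, 10000) starts four threads that each take the lock 10000 times. At any moment at most one waiter is spinning on the locked word, while the rest wait only on rindex.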
Exponential backoff seems to fail horribly once you get over 8 cpus
or so, but the pseudo-FIFO methodology seems to work well up to the
maximum I've been able to test on (48 cpus).
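
For contrast, a rough sketch of what an exponential backoff spin lock typically looks like, again in portable C11. This is a generic textbook form, not the code that was benchmarked; the names (backoff_next, backoff_lock_acquire) and the BACKOFF_MIN/BACKOFF_MAX limits are arbitrary assumptions.

```c
#include <sched.h>
#include <stdatomic.h>

#define BACKOFF_MIN 1u      /* assumed starting delay, in yield calls */
#define BACKOFF_MAX 1024u   /* assumed cap on the delay */

/* Double the delay, saturating at BACKOFF_MAX. */
unsigned backoff_next(unsigned delay)
{
    return (delay >= BACKOFF_MAX / 2) ? BACKOFF_MAX : delay * 2;
}

/* Every waiter retries the lock word directly, pausing longer each time. */
void backoff_lock_acquire(atomic_uint *locked)
{
    unsigned delay = BACKOFF_MIN;

    while (atomic_exchange_explicit(locked, 1, memory_order_acquire) != 0) {
        for (unsigned i = 0; i < delay; i++)
            sched_yield();
        delay = backoff_next(delay);
    }
}

void backoff_lock_release(atomic_uint *locked)
{
    atomic_store_explicit(locked, 0, memory_order_release);
}
```

Unlike the pseudo-FIFO scheme, every waiting cpu here still hits the same lock word, just at staggered intervals, and nothing orders the waiters.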
-Matt
Matthew Dillon
<dillon at backplane.com>
More information about the freebsd-hackers
mailing list