MONITOR/MWAIT question

Sat Jan 8 23:40:02 UTC 2011

:On Sat, Dec 18, 2010 at 02:24:34PM -0800, Matthew Dillon wrote:
:>     Does anyone know if an IRET cancels/triggers a MONITOR event?
:
:AMD's Architecture Programmer's Manual explicitly contains:
:
:Events that cause an exit from the monitor event pending state include:
:...
:- Any far control transfer that occurs between the MONITOR and the
:MWAIT.
:
:Joerg

    Yah.  The Intel documentation listed specific instructions and
    said something about a 'far call' but wasn't generic enough.  My
    AMD manuals are too old, so I'm getting a new set.  The AMD manual
    using the 'any far control transfer' terminology implies that IRET
    is also covered.

    Another interesting question came up and that is whether a write
    on the same cpu that MONITOR was run on (without a far control
    transfer) can trigger a later MWAIT, i.e. MONITOR addr, INCL addr,
    MWAIT addr on the same cpu (so that the MWAIT would then effectively
    be a NOP).  The MONITOR/MWAIT stuff apparently ties into the cpu's
    cache management architecture and a local write to a cache line which
    is already exclusive might not count, so I'm not sure if that case
    is covered.  I can't find a definitive answer so at some point I'll
    actually code something up and test it.

    It isn't a case that any current use triggers, but I don't like
    question marks.

    Right now it looks like MONITOR/MWAIT works quite nicely with a
    pseudo-FIFO reservation model for handling cpu contention.  Basically
    you have a windex and a rindex.  You reserve a 'spot' using XADD on
    the windex and then resolve the cpu<->cpu contention with
    MONITOR/CMP/MWAIT's on rindex.  Only the owner of the rindex (when
    rindex matches the reserved index, which is exactly one cpu out of
    the N contending cpus) can increment rindex.  That way only *ONE* cpu
    at a time is trying to get the spin lock against the current lock
    holder instead of all the cpus contending with each other to try to
    get the spin lock from the current lock holder.
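
    As a rough illustration of the windex/rindex scheme described
    above, here is a minimal userland sketch in C11 atomics.  It is one
    reading of the description, not the actual kernel code: the names
    fifo_spin, fifo_spin_init, fifo_spin_acquire, fifo_spin_release and
    wait_for_turn are hypothetical, atomic_fetch_add stands in for XADD,
    and a plain polling loop stands in for the MONITOR/CMP/MWAIT
    sequence, since those instructions are normally only usable in
    kernel mode.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical layout; the field names follow the post, the rest is assumed. */
struct fifo_spin {
    atomic_uint windex;   /* next reservation to hand out (XADD target) */
    atomic_uint rindex;   /* reservation currently allowed to contend */
    atomic_flag held;     /* the underlying spin lock */
};

static void fifo_spin_init(struct fifo_spin *lk)
{
    atomic_init(&lk->windex, 0);
    atomic_init(&lk->rindex, 0);
    atomic_flag_clear(&lk->held);
}

/*
 * Stand-in for MONITOR/CMP/MWAIT on rindex.  In a kernel this loop would
 * arm MONITOR on &lk->rindex, re-compare, and MWAIT if it still differs;
 * here it simply polls.
 */
static void wait_for_turn(struct fifo_spin *lk, unsigned ticket)
{
    while (atomic_load_explicit(&lk->rindex, memory_order_acquire) != ticket)
        ;
}

/* Returns the reserved index, for illustration only. */
static unsigned fifo_spin_acquire(struct fifo_spin *lk)
{
    /* Reserve a spot in the queue. */
    unsigned ticket = atomic_fetch_add_explicit(&lk->windex, 1,
                                                memory_order_relaxed);

    /* Wait until this cpu owns rindex. */
    wait_for_turn(lk, ticket);

    /* Only the rindex owner gets here, so it contends with the holder alone. */
    while (atomic_flag_test_and_set_explicit(&lk->held, memory_order_acquire))
        ;

    /* Only the owner may advance rindex, handing the slot to the next cpu. */
    assert(atomic_load_explicit(&lk->rindex, memory_order_relaxed) == ticket);
    atomic_fetch_add_explicit(&lk->rindex, 1, memory_order_release);
    return ticket;
}

static void fifo_spin_release(struct fifo_spin *lk)
{
    atomic_flag_clear_explicit(&lk->held, memory_order_release);
}
```

    A caller would bracket its critical section with
    fifo_spin_acquire() and fifo_spin_release().  This sketch always
    takes a reservation; a real implementation might first try the spin
    lock directly and fall back to the queue only on contention.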

    Exponential backoff seems to fail horribly once you get over 8 cpus
    or so, but the pseudo-FIFO methodology seems to work well up to the
    maximum I've been able to test on (48 cpus).
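
    For contrast, a generic test-and-set lock with exponential backoff
    might look like the sketch below.  This is a textbook form, not the
    code that was actually measured; backoff_lock, backoff_unlock and
    the BACKOFF_MAX cap are hypothetical.  Every waiter hits the same
    lock word, which is the all-against-all contention that the
    pseudo-FIFO approach avoids.

```c
#include <stdatomic.h>

#define BACKOFF_MAX 1024u   /* assumed cap on the delay loop */

/* Returns the number of failed attempts, for illustration only. */
static unsigned backoff_lock(atomic_flag *held)
{
    unsigned delay = 1;
    unsigned failures = 0;

    while (atomic_flag_test_and_set_explicit(held, memory_order_acquire)) {
        /* Busy-wait for 'delay' iterations before retrying. */
        for (volatile unsigned i = 0; i < delay; i++)
            ;
        /* Double the delay after each failure, up to the cap. */
        if (delay < BACKOFF_MAX)
            delay *= 2;
        failures++;
    }
    return failures;
}

static void backoff_unlock(atomic_flag *held)
{
    atomic_flag_clear_explicit(held, memory_order_release);
}
```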

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>