seems I finally found what upset kqemu on amd64 SMP... shared gdt! (please test patch :)
Bruce Evans
brde at optusnet.com.au
Sat May 10 12:29:08 UTC 2008
On Sat, 10 May 2008, Juergen Lock wrote:
> On Thu, May 08, 2008 at 09:59:57PM +1000, Bruce Evans wrote:
>> The message in npx.c is actually about violation of an even more
>> fundamental invariant -- the invariant that owning the FPU includes
>> having the TS flag clear so that DNA traps cannot occur. The bug in
>> kqemu seems to be mismanagement of the TS flag related to this. I
>> forget if it is the host or the target TS flag that seems to be mismanaged.
>> For the target, it would take a bug in the virtualization of the TS flag
>> to break this invariant (assuming no related bugs in the target kernel).
>>
> Well the `fpcurthread == curthread' bug has been fixed quite a while
> ago already, or do you mean another one?
I didn't know what was already fixed.
>> The message in amd64/machdep.c is about violation of the invariant
>> that the kernel cannot cause DNA traps. Spurious DNA traps in the
>> ...
>>
> Okay I _think_ I know a little more about this now... kqemu itself
> doesn't use the fpu, but the guest code it runs can, and in that case the
> DNA trap is just used for (host) lazy fpu context switching like as if the
> code was running in userland regularly. And I just tested the following
> patch that should get rid of the message by calling fpudna/npxdna directly
> (files/patch-fpucontext is the interesting part:)
This seems reasonable. Is the following summary of kqemu's implementation
and of your change correct?
- kqemu runs in kernel mode on the host and needs to have exactly the
same effect as a DNA exception on the target.
- having exactly the same effect requires calling the host DNA exception
handler.
- currently it uses a software int $7 (DNA) to implement the above, but this
is not permitted in kernel mode (the software int could be permitted, but
it is hard to distinguish from a hardware exception caused by unintentional
FPU use).
- your change makes it call the DNA trap handler directly. This gives the
same effect as a permitted software int $7. It is also faster.
It would be better to use an official API for this, but none exists.
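
The summary above can be made concrete with a userland toy model. It is
only a sketch, not kernel code: every name in it (model_cpu,
model_dna_handler, model_load_fpu_context, and so on) is hypothetical, and
it assumes that a context switch leaves the FPU unowned with TS set. It
shows that calling the handler directly has the same effect as taking the
trap, and that a second call while the FPU is already owned is rejected.

```c
/* Userland toy model of lazy FPU context switching. Not kernel code:
 * every name below is hypothetical and only mirrors the discussion. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_thread { int id; };

struct model_cpu {
    bool ts;                         /* models CR0.TS: set means the next FPU use traps */
    struct model_thread *fpu_owner;  /* models fpcurthread */
    struct model_thread *cur;        /* models curthread */
    int dna_handled;                 /* number of successful handler runs */
};

/* Models a context switch (assumption: the FPU is handed to nobody and
 * TS is set, so the next FPU use must go through the DNA path). */
static void model_switch_to(struct model_cpu *cpu, struct model_thread *next)
{
    cpu->cur = next;
    cpu->fpu_owner = NULL;
    cpu->ts = true;
}

/* Models the host DNA handler. Returns false when the current thread
 * already owns the FPU, i.e. when "owning implies TS clear" was broken. */
static bool model_dna_handler(struct model_cpu *cpu)
{
    if (cpu->fpu_owner == cpu->cur)
        return false;
    cpu->ts = false;             /* clear TS so FPU use no longer traps */
    cpu->fpu_owner = cpu->cur;   /* current thread now owns the FPU */
    cpu->dna_handled++;
    return true;
}

/* Models the direct call from the patch: same effect as the trap,
 * without raising an exception from kernel mode. */
static bool model_load_fpu_context(struct model_cpu *cpu)
{
    return model_dna_handler(cpu);
}

/* Models an FPU instruction: it reaches the handler only if TS is set. */
static bool model_fpu_use(struct model_cpu *cpu)
{
    if (cpu->ts)
        return model_dna_handler(cpu);
    return cpu->fpu_owner == cpu->cur;
}
```

In this model, model_switch_to() followed by model_load_fpu_context()
leaves TS clear and the current thread as owner, so a later
model_fpu_use() succeeds without running the handler again.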
> ...
> +Index: kqemu-freebsd.c
> +@@ -33,6 +33,11 @@
> +
> + #include <machine/vmparam.h>
> + #include <machine/stdarg.h>
> ++#ifdef __x86_64__
> ++#include <machine/fpu.h>
> ++#else
> ++#include <machine/npx.h>
> ++#endif
> +
> + #include "kqemu-kernel.h"
> +
> +@@ -172,6 +177,15 @@
> + {
> + }
> +
> ++void CDECL kqemu_loadfpucontext(unsigned long cpl)
> ++{
> ++#ifdef __x86_64__
> ++ fpudna();
> ++#else
> ++ npxdna();
> ++#endif
> ++}
Just be sure that the system state is not too different from that of
trap() (directly below a syscall or trap from userland) when this is
called. Better not have any interrupts disabled or locks held, though
I think npxdna() doesn't care. The FPU must not be owned already at
this point.
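
The conditions above can be written down as a small check. This is a
userland sketch with hypothetical names (model_state,
model_check_before_dna); it does not use any real kernel API. It treats
"FPU already owned" as a hard error and the interrupt and lock conditions
as advisory, matching the hedged wording above.

```c
/* Userland sketch of the preconditions for calling the DNA handler
 * directly. All names are hypothetical; no real kernel API is used. */
#include <assert.h>
#include <stdbool.h>

struct model_state {
    bool interrupts_disabled;   /* caller has interrupts disabled */
    int  locks_held;            /* number of locks the caller holds */
    bool fpu_owned_by_current;  /* current thread already owns the FPU */
};

enum model_verdict { MODEL_OK, MODEL_SUSPECT, MODEL_FORBIDDEN };

static enum model_verdict model_check_before_dna(const struct model_state *s)
{
    /* Hard requirement: the FPU must not be owned already. */
    if (s->fpu_owned_by_current)
        return MODEL_FORBIDDEN;
    /* Advisory: the state differs from what trap() would see. */
    if (s->interrupts_disabled || s->locks_held > 0)
        return MODEL_SUSPECT;
    return MODEL_OK;
}
```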
> ++
> + #if __FreeBSD_version < 500000
> + static int
> + curpriority_cmp(struct proc *p)
I guess kqemu duplicates this old mistake instead of calling it because it
is static. npxdna() is already public so it can be abused easily :-).
Bruce