amd64/169927: siginfo,
si_code for fpe errors when error occurs using the SSE math
processor
Konstantin Belousov
kostikbel at gmail.com
Wed Jul 18 03:52:33 UTC 2012
On Wed, Jul 18, 2012 at 10:32:23AM +1000, Bruce Evans wrote:
> On Tue, 17 Jul 2012, Mark Linimon wrote:
>
> >The following reply was made to PR amd64/169927; it has been noted by
> >GNATS.
>
> I'm replying to Mark's forwarding of this, but gnats isn't in the Cc list
> so it will not be noted by gnats, again.
>
> >On Wed, Jul 18, 2012 at 02:03:58AM +1000, Bruce Evans wrote:
> >> Apart from doing the bogus fnclex for T_XMMFLT and the delayed effect of
> >> i387 status bits, merging or not merging the statuses makes little
> >> difference, since if a status bit is set and is not masked according
> >> to its control word, then it will generate a trap soon if it didn't
> >> genearate the current one.
> >
> >The trap number is available for SA_SIGINFO type of handlers with
> >si_trapno member of siginfo_t. I think this is final argument to have
> >separate fputrap_{x87,sse} functions.
>
> OK. Also, you can tell instruction that faulted, provided there are
> no spurious faults. I think spurious faults can only occur for IRQ13
> exception handling which was never supported for amd64 and is no longer
> supported for i386. My old test program that is broken by not clobbering
> the status was mainly for finding and fixing such spurious faults.
> I never owned an i387 or i486SX and had to break i486 exception handling
> to test IRQ13.
486SX+487 never used irq13 as well, since 487 was the 'full' DX package,
which turned off SX socket.
...
> >+
> >+ /*
> >+ * Coomparing with the x87 #MF handler, we do not clear
> >+ * exceptions from the mxcsr.
> >+ */
>
> Spelling error.
>
> This is already covered in too much detail in the big comment before
> the functions. Don't repeat it here.
Removed.
>
>
> >+ if (PCPU_GET(fpcurthread) != curthread)
> >+ mxcsr = curthread->td_pcb->pcb_save->sv_env.en_mxcsr;
> >+ else
> >+ stmxcsr(&mxcsr);
> >+
> >+ critical_exit();
> >+
>
> Style bug (extra blank line). The others are necessary, but become
> unnecessary after removing the comment.
>
> >+ return (fpetable[(mxcsr & (mxcsr >> 16)) & 0x3f]);
> >+}
> >+
> > /*
> > * Implement device not available (DNA) exception
> > *
>
> Cleaning this up gives code that is small enogh for me:
>
> int
> fputrap_sse(void)
> {
> u_int mxcsr;
>
> critical_enter();
> if (PCPU_GET(fpcurthread) != curthread)
> mxcsr = curthread->td_pcb->pcb_save->sv_env.en_mxcsr;
> else
> stmxcsr(&mxcsr);
> critical_exit();
> return (fpetable[(mxcsr & (mxcsr >> 16)) & 0x3f]);
> }
Ok.
>
> I forgot to replace your patch by the newer one before replying.
>
> Just noticed another style/efficiency bug: this should use curpcb,
> not curthread->td_pcb. I don't like the curpcb global, but as
> long as it exists, it should be used. Every time I looked at
> removing this global, it seemed better to keep it. C code can
> easily use curthread->td_pcb instead of curpcb, and this is faster
> if curthread is cached. But CURPCB is easier to use in asm.
I changed to use curpcb (its use is indeed mainly for asm context switch
code). I do want to use pcb_save though, see below.
>
> Similarly in fputrap_x87().
>
> Not similarly for i386. i386 still uses the GET_FPU_CW/SW() macros,
> and these take a thread arg (which is named badly as `thread' instead
> of the normal td) and can't use curpcb. The cleanup here goes with
> your removal of these macros.
>
> I now quote your newer patch for the fputrap_x87() cleanup:
>
> > @@ -531,8 +534,9 @@ static char fpetable[128] = {
> > * solution for signals other than SIGFPE.
> > */
> > int
> > -fputrap()
> > +fputrap_x87(void)
> > {
> > + struct savefpu *pcb_save;
>
> No need for this before or after. Let the compiler cache it. Just use
> curpcb now.
The curpcb access shall be spelled as PCPU_GET(curpcb). Please note that
compiler is not allowed to cache the accesses, since there is __asm
__volatile expression to indirect through %gs-prefixed read.
>
> > u_short control, status;
> >
> > critical_enter();
> > @@ -543,19 +547,40 @@ fputrap()
> > * wherever they are.
> > */
> > if (PCPU_GET(fpcurthread) != curthread) {
> > - control = GET_FPU_CW(curthread);
> > - status = GET_FPU_SW(curthread);
> > + pcb_save = curthread->td_pcb->pcb_save;
> > + control = pcb_save->sv_env.en_cw;
> > + status = pcb_save->sv_env.en_sw;
>
> Change to:
>
> control = curpcb->pcb_save->sv_env.en_cw;
> status = curpcb->pcb_save->sv_env.en_sw;
>
> ...
>
> i386 used curpcb near here in FreeBSD-3, but was changed to use a macro
> in FreeBSD-4. However, the macro still used curpcb, since it was a
> special one that was only used in npx_intr() and thus didn't need to
> start with a general td arg.
diff --git a/sys/amd64/amd64/fpu.c b/sys/amd64/amd64/fpu.c
index a7812b7..f88aba7 100644
--- a/sys/amd64/amd64/fpu.c
+++ b/sys/amd64/amd64/fpu.c
@@ -73,6 +73,7 @@ __FBSDID("$FreeBSD$");
#define fxrstor(addr) __asm __volatile("fxrstor %0" : : "m" (*(addr)))
#define fxsave(addr) __asm __volatile("fxsave %0" : "=m" (*(addr)))
#define ldmxcsr(csr) __asm __volatile("ldmxcsr %0" : : "m" (csr))
+#define stmxcsr(addr) __asm __volatile("stmxcsr %0" : : "m" (*(addr)))
static __inline void
xrstor(char *addr, uint64_t mask)
@@ -105,6 +106,7 @@ void fnstsw(caddr_t addr);
void fxsave(caddr_t addr);
void fxrstor(caddr_t addr);
void ldmxcsr(u_int csr);
+void stmxcsr(u_int csr);
void xrstor(char *addr, uint64_t mask);
void xsave(char *addr, uint64_t mask);
@@ -113,9 +115,6 @@ void xsave(char *addr, uint64_t mask);
#define start_emulating() load_cr0(rcr0() | CR0_TS)
#define stop_emulating() clts()
-#define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_cw)
-#define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_sw)
-
CTASSERT(sizeof(struct savefpu) == 512);
CTASSERT(sizeof(struct xstate_hdr) == 64);
CTASSERT(sizeof(struct savefpu_ymm) == 832);
@@ -514,11 +513,15 @@ static char fpetable[128] = {
};
/*
- * Preserve the FP status word, clear FP exceptions, then generate a SIGFPE.
+ * Preserve the FP status word, clear FP exceptions for x87, then
+ * generate a SIGFPE.
+ *
+ * Clearing exceptions was necessary mainly to avoid IRQ13 bugs and is
+ * engraved in our i386 ABI. We now depend on longjmp() restoring a
+ * usable state. Restoring the state or examining it might fail if we
+ * didn't clear exceptions.
*
- * Clearing exceptions is necessary mainly to avoid IRQ13 bugs. We now
- * depend on longjmp() restoring a usable state. Restoring the state
- * or examining it might fail if we didn't clear exceptions.
+ * For SSE exceptions, the exceptions are not cleared.
*
* The error code chosen will be one of the FPE_... macros. It will be
* sent as the second argument to old BSD-style signal handlers and as
@@ -531,8 +534,9 @@ static char fpetable[128] = {
* solution for signals other than SIGFPE.
*/
int
-fputrap()
+fputrap_x87(void)
{
+ struct savefpu *pcb_save;
u_short control, status;
critical_enter();
@@ -543,19 +547,33 @@ fputrap()
* wherever they are.
*/
if (PCPU_GET(fpcurthread) != curthread) {
- control = GET_FPU_CW(curthread);
- status = GET_FPU_SW(curthread);
+ pcb_save = PCPU_GET(curpcb)->pcb_save;
+ control = pcb_save->sv_env.en_cw;
+ status = pcb_save->sv_env.en_sw;
} else {
fnstcw(&control);
fnstsw(&status);
+ fnclex();
}
- if (PCPU_GET(fpcurthread) == curthread)
- fnclex();
critical_exit();
return (fpetable[status & ((~control & 0x3f) | 0x40)]);
}
+int
+fputrap_sse(void)
+{
+ u_int mxcsr;
+
+ critical_enter();
+ if (PCPU_GET(fpcurthread) != curthread)
+ mxcsr = PCPU_GET(curpcb)->pcb_save->sv_env.en_mxcsr;
+ else
+ stmxcsr(&mxcsr);
+ critical_exit();
+ return (fpetable[(mxcsr & (~mxcsr >> 7)) & 0x3f]);
+}
+
/*
* Implement device not available (DNA) exception
*
diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
index 75e15e0..57d1cc2 100644
--- a/sys/amd64/amd64/trap.c
+++ b/sys/amd64/amd64/trap.c
@@ -328,7 +328,7 @@ trap(struct trapframe *frame)
break;
case T_ARITHTRAP: /* arithmetic trap */
- ucode = fputrap();
+ ucode = fputrap_x87();
if (ucode == -1)
goto userout;
i = SIGFPE;
@@ -442,7 +442,9 @@ trap(struct trapframe *frame)
break;
case T_XMMFLT: /* SIMD floating-point exception */
- ucode = 0; /* XXX */
+ ucode = fputrap_sse();
+ if (ucode == -1)
+ goto userout;
i = SIGFPE;
break;
}
diff --git a/sys/amd64/include/fpu.h b/sys/amd64/include/fpu.h
index 98a016b..7d0f0ea 100644
--- a/sys/amd64/include/fpu.h
+++ b/sys/amd64/include/fpu.h
@@ -62,7 +62,8 @@ int fpusetregs(struct thread *td, struct savefpu *addr,
char *xfpustate, size_t xfpustate_size);
int fpusetxstate(struct thread *td, char *xfpustate,
size_t xfpustate_size);
-int fputrap(void);
+int fputrap_sse(void);
+int fputrap_x87(void);
void fpuuserinited(struct thread *td);
struct fpu_kern_ctx *fpu_kern_alloc_ctx(u_int flags);
void fpu_kern_free_ctx(struct fpu_kern_ctx *ctx);
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-amd64/attachments/20120718/571356cf/attachment-0001.pgp
More information about the freebsd-amd64
mailing list