I've submitted 207175 for a clang 3.8.0 va_list handling problem for powerpc
Mark Millard
markmi at dsl-only.net
Sun Feb 14 22:34:34 UTC 2016
On 2016-Feb-14, at 11:29 AM, Roman Divacky <rdivacky at vlakno.cz> wrote:
>
> Fwiw, the code to handle the vaarg is in
> tools/clang/lib/CodeGen/TargetInfo.cpp:PPC32_SVR4_ABIInfo::EmitVAArg()
>
> You can take a look to see whats wrong.
>
> On Sat, Feb 13, 2016 at 07:03:29PM -0800, Mark Millard wrote:
>> I've isolated another clang 3.8.0 TARGET_ARCH=powerpc SEGV problem that shows up for using clang 3.8.0 to buildworld/installworld for powerpc.
>>
>>> ls -l -n /
>>
>> gets a SEGV. As listed in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=207175 ( and https://llvm.org/bugs/show_bug.cgi?id=26605 ) the following simplified program also gets the SEGV on powerpc:
>>
>>> #include <stdarg.h> // for va_list, va_start, va_arg, va_end
>>> #include <stdint.h> // for intmax_t
>>>
>>> intmax_t
>>> va_test (char *s, ...)
>>> {
>>> va_list vap;
>>>
>>> va_start(vap, s);
>>>
>>> char* t0 = va_arg(vap, char*);
>>> unsigned int o0 = va_arg(vap, unsigned int);
>>> int c0 = va_arg(vap, int);
>>> unsigned int u0 = va_arg(vap, unsigned int);
>>> int c1 = va_arg(vap, int);
>>> char * t1 = va_arg(vap, char*);
>>>
>>> intmax_t j0 = va_arg(vap, intmax_t); // This spans into overflow_arg_area.
>>>
>>> int c2 = va_arg(vap, int); // A copy was put in the
>>> // overflow_arg_area because of the
>>> // above.
>>> // But this tries to extract from the
>>> // last 4 bytes of the reg_save_area.
>>> // It does not increment the
>>> // overflow_arg_area position pointer
>>> // past the copy that is there.
>>>
>>> char * t2 = va_arg(vap, char*); // The lack of increment before makes
>>> // this extraction off by 4 bytes.
>>>
>>> char t2fc = *t2; // <<< This gets SEGV. t2 actually got what should be
>>> // the c2 value.
>>>
>>> intmax_t j1 = va_arg(vap, intmax_t);
>>>
>>> va_end(vap);
>>>
>>> return (intmax_t) ((s-t2)+(t0-t1)+o0+u0+j0+j1+c0+c1+c2+t2fc);
>>> // Avoid any optimize-away for lack of use.
>>> }
>>>
>>> int main(void)
>>> {
>>> char s[1025] = "test string for this";
>>>
>>> char* t0 = s + 5;
>>> unsigned int o0 = 3;
>>> int c0 = 1;
>>> unsigned int u0 = 1;
>>> int c1 = 3;
>>> char * t1 = s + 12;
>>> intmax_t j0 = 314159265358979323;
>>> int c2 = 4;
>>> char * t2 = s + 16;
>>> intmax_t j1 = ~314159265358979323;
>>>
>>> intmax_t result = va_test(s,t0,o0,c0,u0,c1,t1,j0,c1,t2,j1);
>>>
>>> return (int) (result - (intmax_t) ((s-t2)+(t0-t1)+o0+u0+j0+j1+c0+c1+c2+*t2));
>>> // Avoid any optimize-away for lack of use.
>>> }
>>
>>
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>>
>> _______________________________________________
>> freebsd-toolchain at freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
>> To unsubscribe, send any mail to "freebsd-toolchain-unsubscribe at freebsd.org"
clang's code base is not familiar material for me nor do I have solid reference material for the FreeBSD TARGET_ARCH=powerpc ABI rules so the below has my guess work involved.
The following code appears to have hard wired a global, unvarying constant (8) into the test for picking UsingRegs vs. UsingOverflow.
> llvm::Value *NumRegs = Builder.CreateLoad(NumRegsAddr, "numUsedRegs");
. . .
> llvm::Value *CC =
> Builder.CreateICmpULT(NumRegs, Builder.getInt8(8), "cond");
>
> llvm::BasicBlock *UsingRegs = CGF.createBasicBlock("using_regs");
> llvm::BasicBlock *UsingOverflow = CGF.createBasicBlock("using_overflow");
> llvm::BasicBlock *Cont = CGF.createBasicBlock("cont");
>
> Builder.CreateCondBr(CC, UsingRegs, UsingOverflow);
. . .
> // Case 1: consume registers.
> Address RegAddr = Address::invalid();
> {
. . .
> // Increase the used-register count.
> NumRegs =
> Builder.CreateAdd(NumRegs,
> Builder.getInt8((isI64 || (isF64 && IsSoftFloatABI)) ? 2 : 1));
> Builder.CreateStore(NumRegs, NumRegsAddr);. . .
. . .
> }
>
> // Case 2: consume space in the overflow area.
> Address MemAddr = Address::invalid();
> {
. . . (no adjustments to NumRegs) . . .
If so the means of counting NumRegs (a.k.a. gpr) then needs to take into account an allocated but unused last UsingRegs "slot" sometimes. Imagine. . .
r3, r4, r5, r6, r7, r8, r9 in use already so r10 is the last possible "UsingRegs" context.
(0 1 2 3 4 5 6, leaving r10 as position 7, the last < 8 value)
Then the next two arguments are a 8 byte integer then a a 4 byte integer (in that order). That results in what should be:
r10 "UsingRegs" slot reserved and un-accessed
In other words: counted as allocated so that the rest goes in in the overflow area
(so no position 7 usage)
then
overflow with the 8 byte integer then the 4 byte integer.
And, in fact, the memory content reflects this in the overflow area.
But the va_arg access code does not count r10's slot as allocated in "Using Regs" after the 8 byte integer. So later it tries to use r10's slot for the 4 byte integer that is actually in the UsingOverflow area.
One fix of sorts is to have "Case 2: consume space in the overflow area." set NumRegs (a.k.a. gpr) to the bound from the Builder.CreateICmpULT (8 in this context). Then the first (or any/every) use of the UsingOverflow area forces no more use of the UsingRegs area (for the involved va_list).
===
Mark Millard
markmi at dsl-only.net
More information about the freebsd-ppc
mailing list