Crashes on arm caused by stack corruption

Weiß, Jürgen weiss at uni-mainz.de
Sat Mar 8 18:23:00 UTC 2014


For quite a while I have been observing random, or not so random, crashes on i.MX6.

The svc stack grows into the undefined instruction exception stack,
as can be seen in the register dump below, where svc_sp is already
lower than und_sp. There is also corruption around the address held
in the und_sp register.

db> show reg
spsr        0x600001d3
r0                   0
r1          0xc246f3c0  __pcpu
r2          0xc2438100  pcpup
r3          0xc2379039
r4          0xc6460000
r5          0xc23789e2
r6          0xc23733b4
r7          0xc6460000
r8                   0
r9                 0x1
r10                  0
r11         0xbfffe388
r12                  0
usr_sp      0xbfffdf68
usr_lr         0x21e04
svc_sp      0xf183afe8
svc_lr      0xc216add8  mi_switch+0x2b8
pc          0x4278f500
und_sp      0xf183aff0
abt_sp      0xc2565000
irq_sp      0xc2561000


The strange thing is that a separate undefined instruction exception
stack, undstack, is allocated for each core in initarm and assigned
to und_sp.

But later on in cpu_switch, und_sp is loaded 
        ldr     sp, [r9, #(PCB_UND_SP)]
from un_32.pcb32_und_sp, which is initialized to
#define USPACE_UNDEF_STACK_TOP          (USPACE_SVC_STACK_BOTTOM - 0x10)
which comes from
#define USPACE_SVC_STACK_BOTTOM         (USPACE_SVC_STACK_TOP - 0x1000)
and effectively halves the svc stack.

The undefined instruction exception stack is hardly used, apart from
a few words right at the beginning of the exception handling. The
exception frame is actually built on the svc stack.

Now undefined instruction exceptions should only happen in user mode
(VFP). At that point the used part of the svc stack should be small
enough that no harm results. But in cpu_throw the code for updating
und_sp is missing. So sometimes undefined instruction exceptions
write to the kernel stack of the wrong process/thread.

So there seem to be two solutions. Unless I am missing something, I
would suggest simply dropping the code that switches the undefined
instruction exception stack. The other is to add the missing
part to cpu_throw. See the attached patch for both possibilities.

With either solution the crashes on my system are gone.

I think this problem affects all arm systems.

There is another problem with the handling of undefined
instructions. The first few instructions of the undefined instruction
exception handler use static variables and are definitely
not SMP safe. I wonder why the code is not similar to the
prefetch abort handler or the data abort handler.

Regards

Juergen



-------------- next part --------------
A non-text attachment was scrubbed...
Name: und_stack.diff
Type: application/octet-stream
Size: 1652 bytes
Desc: und_stack.diff
URL: <http://lists.freebsd.org/pipermail/freebsd-arm/attachments/20140308/93dbd806/attachment.obj>

