kern/177876: [mips] kernel stack overflow panic on mips64, EdgeRouter Lite
Joe Holden
lists at rewt.org.uk
Mon Apr 22 22:26:11 UTC 2013
Joe Holden wrote:
> On Apr 22, 2013, at 11:59 AM, Juli Mallett wrote:
>
>> On Mon, Apr 22, 2013 at 10:35 AM, Adrian Chadd <adrian at freebsd.org> wrote:
>>> Do an svn log in sys/mips/ or sys/vm/ and look at the changes.
>>>
>>> I don't know how far you can go back before you don't have the
>>> edgerouter lite support, but maybe you can try going back to when
>>> Juli initially committed it, and then just work your way forward.
>>>
>>> I think Juli did the initial work, so she knows when it came in.
>>>
>>> juli - I don't suppose you could spin up FreeBSD-HEAD on the
>>> edgerouter lite and take a look? It's highly likely someone messed up
>>> since you did your port. :(
>> I can't quite imagine why EdgeRouter Lite (or Octeon more generally)
>> could be a special case here; I'd be more inclined to think it was
>> generally 64-bit MIPS that would be broken. (A too-conservative
>> definition or something.) Except I was pretty sure I'd run -CURRENT
>> more recently than those changes.
>>
>> The only change that is suspect in mips/ since I made my changes is
>> Warner's change to include/regnum.h, which looks like there's the slim
>> possibility that it could screw up register saving in N64 builds.
>> That would mean that it wasn't tested with a 64-bit build, though,
>> which I'm sure Warner wouldn't be so sloppy as to do.
>>
>> Joe, can you try reverting 249523 and seeing if that fixes things for
>> you? It seems like this breaks the order of registers saved to the
>> PCB, which would break syscalls with more than 4 arguments, like mmap.
>> Even just looking at how the macros expand in the N64 case makes it
>> pretty clear that this change was made clumsily, e.g. from
>> exception.S:
>>
>> SAVE_REG($12, 8, $29)
>> SAVE_REG($13, 9, $29)
>> SAVE_REG($14, 10, $29)
>> SAVE_REG($15, 11, $29)
>> SAVE_REG($8, 12, $29)
>> SAVE_REG($9, 13, $29)
>> SAVE_REG($10, 14, $29)
>> SAVE_REG($11, 15, $29)
>>
>> For this to not break syscalls, struct trapframe would need to be
>> updated,
>
> Looking at the trapframe, you are right. <doh>. I did test boot a kernel
> with the change, but after-the-fact software forensics suggest I built the
> new kernel and tested the old one. I found the new one installed as
> kenrel.oct rather than kernel.oct which I test booted...
>
>> or the syscall handling code. Joe, can you confirm that backing out
>> 249523 fixes things for you? If it does, Adrian, would you be willing
>> to handle a backout? I can't imagine finding the time for a couple of
>> days, and if this is really so badly, unnecessarily broken, that
>> should be fixed immediately. I hope I'm wrong. Nobody should be
>> making incomplete changes on the basis of a half-baked reading of
>> purportedly-conflicting documentation, and without testing.
>> Yikes!
>
> <snip>
>
> I am just building a pre-commit kernel, but if you guys know what it is I'll
> wait for a fix :)
>
> Will this also fix the trapframe issue when the box is under heavy cpu load
> or is that a different issue?
>
OK, so that is confirmed: I reverted regnum.h and it boots fine.
J