Fatal kernel trap during boot 8.0-CURRENT
Andreas Tobler
andreast-list at fgznet.ch
Thu Apr 23 19:26:46 UTC 2009
Bartlomiej Sieka wrote:
>
> On 2009-04-23, at 04:58, Nathan Whitehorn wrote:
>
>> Andreas Tobler wrote:
>>> Jochen Fahrner wrote:
>>>> Hi,
>>>> after a few days of power off I wanted to boot the 8.0 kernel I
>>>> installed last week on my iMac G3.
>>>> I got a kernel trap after starting sc0 driver:
>>>>
>>>> =======================
>>>> sc0: Unknown <16 virtual consoles, flags=0x300>
>>>> Timecounter "decrementer" frequency 24960000 Hz quality 0
>>>> Timecounters tick every 10.000 msec
>>>>
>>>> fatal kernel trap:
>>>> exception = 0x7 (program)
>>>> srr0 = 0x509168
>>>> srr1 = 0x83032
>>>> lr = 0x4f9788
>>>> curthread = 0x633a30
>>>> pid = 0, comm=swapper
>>>> thread pid 0 tid 100000
>>>> Stopped at 0x509168
>>>> illegal instruction 7c0049ce
>>>> ==========================
>>>>
>>>> I could repeat this several times.
>>>> Then I booted my old 7.1 kernel without problems.
>>>> After that I also could boot 8.0 again.
>>>
>>> FYI, I experience the same on my iMac G3. And I use the same
>>> procedure to get back to -CURRENT.
>> This is related to Altivec support. 7c0049ce is stvx v0,r0,r9,
>> which is the first executed Altivec instruction in save_vec(), and the
>> faulting address is close to where save_vec() ends up in my kernel.
>> save_vec() can only be called if the process is marked with PCB_VEC. I
>> have no idea how that ends up happening, and I can't duplicate the
>> problem on my G3. One option would be to insert a panic() or a
>> kdb_backtrace() into enable_vec(), which might at least tell us where
>> it is getting called from...
>>
>> The only thing I can think of is that the 750 is taking a performance
>> monitor exception and falling through to the EXC_VEC handler, which
>> will try to turn on Altivec. The way Altivec support works is that
>> only Altivec-aware processors should ever fault to EXC_VEC, in which
>> case we should be fine setting PCB_VEC on the process. Very confusing...
>
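For anyone who wants to check the decoding of 7c0049ce by hand, here is a
minimal userland sketch in C. It assumes the standard PowerPC X-form field
layout (6-bit primary opcode, three 5-bit register fields, 10-bit extended
opcode) and that stvx is primary opcode 31 with extended opcode 231. The
struct and function names are made up for illustration and are not taken
from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the fields of a PowerPC X-form instruction. */
struct xform {
	unsigned opcd;	/* bits 0-5: primary opcode */
	unsigned rs;	/* bits 6-10: source register (vS for stvx) */
	unsigned ra;	/* bits 11-15: base register (0 means literal zero) */
	unsigned rb;	/* bits 16-20: index register */
	unsigned xo;	/* bits 21-30: extended opcode */
};

/* Split a 32-bit instruction word into its X-form fields. */
struct xform
decode_xform(uint32_t insn)
{
	struct xform f;

	f.opcd = (insn >> 26) & 0x3f;
	f.rs = (insn >> 21) & 0x1f;
	f.ra = (insn >> 16) & 0x1f;
	f.rb = (insn >> 11) & 0x1f;
	f.xo = (insn >> 1) & 0x3ff;
	return (f);
}

/* Nonzero if the word encodes stvx (opcode 31, extended opcode 231). */
int
is_stvx(uint32_t insn)
{
	struct xform f = decode_xform(insn);

	return (f.opcd == 31 && f.xo == 231);
}

/* Check the word from the trap message against "stvx v0,r0,r9". */
void
check_reported_insn(void)
{
	struct xform f = decode_xform(0x7c0049ceu);

	assert(is_stvx(0x7c0049ceu));
	assert(f.rs == 0 && f.ra == 0 && f.rb == 9);
}
```

Calling check_reported_insn() from a small test program should pass without
tripping an assertion, which matches Nathan's reading of the instruction.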
I added a kdb_backtrace() call at the beginning of save_vec(). I had to
write the trace down by hand:
0xd00048f0 at kdb_backtrace+0x4c
0xd0004910 at save_vec+0x1c
0xd0004930 at cpu_switch+0x54
0xd0004960 at mi_switch+0x290
0xd0004990 at sleepq_switch+0xcc
0xd00049b0 at sleepq_timedwait+0x58
0xd00049e0 at _cv_timedwait+0x1b4
0xd0004a20 at _sema_timedwait+0x84
0xd0004a50 at ata_queue_request+0x410
.....
If anyone needs the full stack trace, I can mail a JPEG privately; I
don't want to spam the list.
> Perhaps the problem is related to an issue we came across while working
> on Efika support. The issue was that the Altivec-specific code was
> executed, due to PCB_VEC being set when it shouldn't (Efika has the
> MPC5200B SoC, which is e300-based). PCB_VEC turned out to be set
> because thread0.td_pcb contained garbage, and our problem went away
> after zeroing the thread0.td_pcb in powerpc_init(), similarly to what
> booke/machdep.c implementation does.
>
> Please try the attached patch and see if it fixes the problem seen on
> iMac G3.
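To illustrate the failure mode Bartlomiej describes, here is a small
userland model in C. It is not kernel code and not the attached patch. The
struct mock_pcb, the MOCK_PCB_VEC bit value and all function names are
stand-ins I made up; the real struct pcb and PCB_VEC live in the powerpc
headers and differ. The sketch only shows why a flags word that was never
zeroed can appear to have the Altivec bit set, and why clearing the whole
structure once at init time avoids that.

```c
#include <assert.h>
#include <string.h>

#define MOCK_PCB_VEC	0x4	/* assumed bit value, for illustration only */

/* Hypothetical stand-in for the per-thread process control block. */
struct mock_pcb {
	unsigned long regs[8];
	int flags;
};

/* Simulate memory that nobody initialized before first use. */
void
fill_with_garbage(struct mock_pcb *pcb)
{
	memset(pcb, 0xa5, sizeof(*pcb));
}

/* What the fix amounts to in this model: zero the pcb before first use. */
void
init_pcb(struct mock_pcb *pcb)
{
	memset(pcb, 0, sizeof(*pcb));
}

/* The switch path would save vector state only if this is nonzero. */
int
needs_vec_save(const struct mock_pcb *pcb)
{
	return ((pcb->flags & MOCK_PCB_VEC) != 0);
}

/* Report whether the vector flag looks set, with or without zeroing. */
int
demo_flag_set(int zero_first)
{
	struct mock_pcb pcb;

	fill_with_garbage(&pcb);
	if (zero_first)
		init_pcb(&pcb);
	return (needs_vec_save(&pcb));
}

/* Garbage makes the flag look set; zeroing clears it. */
void
self_test(void)
{
	assert(demo_flag_set(0) == 1);
	assert(demo_flag_set(1) == 0);
}
```

In this model the garbage pattern happens to set the mock flag every time.
With real uninitialized memory the result would depend on whatever was left
there, which would fit a trap that shows up on most cold boots but not all.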
Bartlomiej, I tried your suggested fix and it looks good. With the fix
applied I have not been able to reproduce the trap so far. Without it I
can trigger the trap on a cold boot in roughly 8 of 10 tries.
Thank you very much!
Andreas