backtrace information from the 2nd(?) most common boot crash place on PowerMac G5's: just after real memory = ... (... MB)

Mark Millard markmi at dsl-only.net
Tue Sep 30 20:49:40 UTC 2014


Crashing between the "real memory = ..." message and the "avail memory = ..." message has been the second most common place in the boot message sequence to fail, but it has been rather rare: in months I've only seen it a few times, despite all my reboots from the primary crash place issue and for deliberate testing/evidence finding about the boot crashes. (My primary FreeBSD activity is exploring FreeBSD via investigating the problems I have, primarily the early boot crash.)

I've seen it crash there on a variety of versions since I've been updating regularly but it crashes there only rarely. Only in recent times have I been building from source instead of using the MANIFEST and *.txz files with bsdinstall or before that using the .iso images. It crashed there back before I'd ever installed a kernel or world via my own build.

Of course with the DDB dump hack in place I get more information as things are now.

Unfortunately with the rarity I'm not able to effectively test if a specific installation/version/build/... has the problem or not. The best that I've got is to report the information I get on the rare occasion it does fail.

The 3 PowerMac G5's have: 8 GB (Dual processor), 12 GB (Quad core), 16 GB (Quad core). I have tended to use the 16 GB one the most. But it sounds like I should switch to one of the others as the primary for a few months to see how things go for this issue.

While I think that I've seen that stopping place on more than one of the G5's it is possible that I'm wrong about that given the rarity. If I'm right about it then I may never have seen the problem on the 8 GB Dual processor one: more likely the two Quad cores.  But, again, I'm not sure. I tend to use the Dual processor one the least by a noticeable amount, however.

I certainly have seen the primary crash relative to message timing (before the Copyright notice) on all 3 G5's ever since I started exploring FreeBSD. Of course only with the DDB dump hack in place do I have evidence of just where those crashes happen internally.

I have reported one backtrace that is earlier than the first ofwcall with pmap_bootstrapped!=0.  It is the only example I have of that so far. Again: before the DDB hack I'd not have had the evidence to make the distinction, and it seems too rare to deliberately test for any specific build/version having the problem.

I've not tested if the .iso's still have the during-openfirmware loading boot-hang problem on the G5's in some time. So I do not know the status for that.



I'm really curious what the explanation is for the first ofwcall with pmap_bootstrapped!=0 sometimes failing and sometimes not. And similarly for other variability, but the other crashes seem too rare to have much chance of learning the answer.





===
Mark Millard
markmi at dsl-only.net

On Sep 30, 2014, at 8:09 AM, Nathan Whitehorn <nwhitehorn at freebsd.org> wrote:

How much RAM is in the machine? I've never ever heard of this happening before and have been using one of these daily for four years. Clearly, there's something special about your configuration. This error, in particular, means that the direct map has been evicted from the page table. I can't imagine any possible way for that to happen; it's basically the least likely fault that I can think of and almost certainly indicates memory corruption or a hardware fault. Do you see this with an unmodified 10.1-BETA2 kernel?
-Nathan

On 09/27/14 00:47, Mark Millard wrote:
> The following includes backtrace information from the 2nd most common boot crash place in the boot message sequence on PowerMac G5's: just after it reports
> 
> real memory = ... (... MB).
> 
> Classically it reports data storage interrupt here and it did again. But more is dumped in my current configuration than before.
> 
> FreeBSD FBSDG5M1 10.1-BETA2 FreeBSD 10.1-BETA2 #16 r271944M: Fri Sep 26 23:01:54 PDT 2014     root at FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64  powerpc
> 
> but with options DDB and GDB in GENERIC64, WITH_DEBUG_FILES=, WITHOUT_CLANG=, WITH_DEBUG= in /etc/make.conf. Also: DDB hacked to dump various things automatically so it happens during early boot crashes/hangs.
> 
> The information reported was...
> 
> fatal kernel trap
> 
> exception = 0x300 (data storage interrupt)
> virtual address = 0x75e0000
> dsisr = 0x42000000
> curthread = 0xdbc290
> pid = 0, comm =
> 
> srr0: 0x885608 .moea64_zero_page+0x1ac (a dcbz r0,r10)
> lr: 0x8ba31c .pmap_zero_page+0x7c
> ctr: 0x88545c .moea64_zero_page
> 
> 0x8ba318: .pmap_zero_page+0x78
> 0x84167c: .kmem_back+0x2d0
> 0x8417fc: .kmem_malloc+0x7c
> 0x840dc4: .vm_ksubmap_init+0x8c
> 0x882130: .cpu_startup+0x10c
> 0x4d9c10: .mi_startup+0x10c
> btext+0xbc (???)
> 
> r0: 0x1
> r1: 0xc000000000008740
> r2: 0xd19468
> r3: 0xe4d3a8 mmu_kernel_obj
> r4: 0xc000000002bfc290
> r5: 0xc7dfa0 mmu_zero_page_desc
> r6: 0xc000000000063af8
> r7: 0x2
> r8: 0xe0c310 vm_phys_free_queues
> r9: 0x80 dbsize+0xc
> r10: 0x7f5e0000
> r11: 0x80 dbsize+0xc
> r12: 0x24042042
> r13: 0xdbc290 thread0
> r14-r19: all 0
> r20: 0x10c2000
> r21: 0x4
> r22: 0x163f000
> r23: 0xc0000000d03fd000
> r24: 0x3800
> r25: 0x262
> r26: 0x400000000000000
> r27: 0xe4d3a8 mmu_kernel_obj
> r28: 0xc000000002bfc290
> r29: 0xc000000002bfc290 (yes: again)
> r30: 0x75e0000
> r31: 0xc000000000008740
> 
> cr: 0x44042044
> xer: 0
> (I did not write down srr1. Drat.)
> 
> ===
> Mark Millard
> markmi at dsl-only.net
> 
> 
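For anyone reading the dump above, here is a minimal sketch of how the dsisr value and the srr0 symbol offset can be checked by hand. It assumes the usual 64-bit PowerPC DSISR layout for a data storage interrupt (0x40000000 = no translation found, 0x08000000 = protection violation, 0x02000000 = access was a store; dcbz counts as a store). The macro and function names are illustrative only, loosely modeled on FreeBSD's powerpc trap.h, and should be verified against the source tree in use.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed DSISR bit names (illustrative; check sys/powerpc/include/trap.h). */
#define DSISR_NOTFOUND 0x40000000u /* no translation found for the access */
#define DSISR_PROTECT  0x08000000u /* access violated page protection */
#define DSISR_STORE    0x02000000u /* faulting access was a store (or dcbz) */

static bool dsisr_not_found(uint32_t dsisr) { return (dsisr & DSISR_NOTFOUND) != 0; }
static bool dsisr_protect(uint32_t dsisr)   { return (dsisr & DSISR_PROTECT) != 0; }
static bool dsisr_is_store(uint32_t dsisr)  { return (dsisr & DSISR_STORE) != 0; }

/* Offset of a program counter value from the start of its function. */
static uint64_t symbol_offset(uint64_t pc, uint64_t func_start)
{
    return pc - func_start;
}

/* Print a one-line summary of a data storage interrupt. */
static void describe_fault(uint32_t dsisr, uint64_t srr0, uint64_t func_start)
{
    printf("dsisr=0x%08" PRIx32 ": %s%s%s, pc is function+0x%" PRIx64 "\n",
           dsisr,
           dsisr_not_found(dsisr) ? "translation-not-found " : "",
           dsisr_protect(dsisr) ? "protection-violation " : "",
           dsisr_is_store(dsisr) ? "store" : "load",
           symbol_offset(srr0, func_start));
}
```

Calling describe_fault(0x42000000, 0x885608, 0x88545c) with the dsisr, srr0 and ctr values from the dump reports a store with no translation found at function+0x1ac. That matches the dcbz in .moea64_zero_page and fits the description of a missing direct-map entry in the page table.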



