My PowerMac G5's no longer crash at boot: PowerMac G5 specific ofwcall changes with justifying evidence

Mark Millard markmi at
Mon Oct 13 00:53:59 UTC 2014

NOTE: I make no claim that any of the below hacks for ofwcall are appropriate code for FreeBSD's general context. I only claim that it seems to make the specific PowerMac G5 problem go away, gives solid evidence for at least some of what is going on (justifying the investigative and testing hacks) and so gives evidence for an appropriate, more general FreeBSD solution.

The big issue is: The PowerMac G5 openfirmware does not always preserve the %r1 value (the stack pointer contents) that it is initially given, at least when the early "before copyright" crash problem is happening but possibly other times as well.

I had the following investigative code in ofwcall, snapshotting the value of %r1 before and after openfirmware's code is used:

 	lis	%r4,openfirmware_entry@ha
 	ld	%r4,openfirmware_entry@l(%r4)
 	mr	%r17,%r1	/* ADDED HACK: record %r1 before the call */
 	/* Finally, branch to OF */
 	mtctr	%r4
 	bctrl
 	mr	%r18,%r1	/* ADDED HACK: record %r1 after the call */

With that in place, the DDB "show registers" output that I had hacked in for the crash shows these values in r17 and r18, instead of the zeros those registers otherwise always display, in addition to what show registers has always shown for r1.

The results were like the following example for every such crash:

r17 = 0xC31400 ofwstk+0xfe0
r18 = 0xd24450
r1  = 0xd24450

Because of that %r1 value the later code such as:

 	/* Reload stack pointer and MSR from the OFW stack */
 	ld	%r6,24(%r1)
 	ld	%r2,16(%r1)
 	ld	%r1,8(%r1)

gets garbage-in/garbage-out results, including %r6 ending up with values like 0xbc0568 instead of the saved MSR value that is supposed to be restored later: 0x9000000000001032.

So one PowerMac G5 specific hack involved in my working-boots context is to force the original %r1 value to be used (based on %r17 being a before-call copy, similar to the above):

 	ld	%r6,24(%r17)
 	ld	%r2,16(%r17)
 	ld	%r1,8(%r17)
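
As a rough illustration only (not kernel code), the difference between the two reload sequences can be modeled in C. The struct, its field names, and the made-up stack pointer and TOC values are assumptions for this sketch; only the 8/16/24 offsets, the 0x9000000000001032 MSR value, and the 0xbc0568 garbage value come from the report above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the doublewords at the top of the OFW stack that
 * ofwcall reads back after the firmware returns.  Field names are
 * illustrative only; the offsets match the ld instructions above.
 */
struct ofw_frame {
	uint64_t backchain;	/* offset 0 */
	uint64_t saved_sp;	/* offset 8,  reloaded into %r1 */
	uint64_t saved_toc;	/* offset 16, reloaded into %r2 */
	uint64_t saved_msr;	/* offset 24, reloaded into %r6 */
};

_Static_assert(offsetof(struct ofw_frame, saved_msr) == 24,
    "MSR slot must sit at offset 24, as in ld %r6,24(base)");

/* Models "ld %r6,24(base)": fetch the MSR slot relative to a base pointer. */
static uint64_t
reload_msr(const struct ofw_frame *base)
{
	return (base->saved_msr);
}

/*
 * Returns 1 when reloading through the pre-call copy yields the saved MSR
 * while reloading through the clobbered pointer does not.
 */
static int
precall_copy_is_safer(void)
{
	const uint64_t msr = 0x9000000000001032ULL;	/* value from the report */
	struct ofw_frame frame = { 0, 0x1000, 0x2000, msr };	/* sp/toc made up */
	struct ofw_frame junk = { 0, 0, 0, 0xbc0568 };	/* unrelated memory */

	const struct ofw_frame *r17 = &frame;	/* copy taken before the call */
	const struct ofw_frame *r1 = &junk;	/* what the firmware left behind */

	return (reload_msr(r17) == msr && reload_msr(r1) != msr);
}
```

The point of the sketch is only that the three loads are correct exactly when their base register still points at the frame written before the call, which is what keeping a copy in %r17 is meant to guarantee.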

But the exception report from DDB has had problems, in part because sprg0 still holds the openfirmware value at that point even though the exception happens after openfirmware has returned (so GET_CPUINFO(<register>) puts the wrong value in the register). So I hacked in a restore of FreeBSD's sprg0 inside ofwcall, ahead of where the exception happens, so that the exception handler code has that much FreeBSD context available if the exception occurs. This was really just to help with information gathering, although I have not tested having only the %r17 changes.

So overall, the PowerMac G5 specific hacking changes the ofwcall code as follows (based on what was reported above):

root@FBSDG5M1:~ # svnlite diff /usr/src/sys/powerpc/ofw/ofwcall64.S
Index: /usr/src/sys/powerpc/ofw/ofwcall64.S
--- /usr/src/sys/powerpc/ofw/ofwcall64.S	(revision 272558)
+++ /usr/src/sys/powerpc/ofw/ofwcall64.S	(working copy)
@@ -52,6 +52,12 @@
 	.llong	0			/* RTAS entry point */
+ /* HACK: part of having sprg0 in place for trap */
+	.space	8 /* sizeof(register_t) */
+	.llong	0
  * Open Firmware Real-mode Entry Point. This is a huge pain.
@@ -97,6 +103,10 @@
 	lis	%r4,openfirmware_entry@ha
 	ld	%r4,openfirmware_entry@l(%r4)
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis	%r14,ofw_sprg0_save@ha
+	ld	%r14,ofw_sprg0_save@l(%r14)
 	 * Set the MSR to the OF value. This has the side effect of disabling
 	 * exceptions, which is important for the next few steps.
@@ -123,14 +133,27 @@
 	stw	%r5,4(%r1)
 	stw	%r5,0(%r1)
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis	%r6,ofwsprg0save@ha
+	std	%r14,ofwsprg0save@l(%r6)
+	/* HACK: part of IGNORING the later %r1 value from openfirmware */
+	mr	%r17,%r1
 	/* Finally, branch to OF */
 	mtctr	%r4
+	/* HACK: part of having FreeBSD's sprg0 in place for the exception problem */
+	lis	%r6,ofwsprg0save@ha
+	ld	%r6,ofwsprg0save@l(%r6)
+	mtsprg0	%r6
 	/* Reload stack pointer and MSR from the OFW stack */
-	ld	%r6,24(%r1)
-	ld	%r2,16(%r1)
-	ld	%r1,8(%r1)
+	/* HACKED to ignore the %r1 value that results from openfirmware's call */
+	ld	%r6,24(%r17)
+	ld	%r2,16(%r17)
+	ld	%r1,8(%r17)
 	/* Now set the real MSR */
 	mtmsrd	%r6

With these changes there have been no crashes so far in my testing, not even on the 16 GByte RAM machine that crashed so much.

NOTE: ofw_machdep.c was changed to use "extern register_t ofw_sprg0_save;" to match the above.

I still have ps3 disabled in GENERIC64 so that I can also have the sc options in GENERIC64. And the DDB and GDB options are still present as well.

And I still have my hack to force a DDB script that does show registers and shows the ofwcall history information that I hacked in, even for the very early crashes before input is possible. Not that the script is getting executed now that the crashes have stopped. (A backtrace from before the possible crash point is also shown by the added code. That still shows up.)

I'll probably next switch to reverting the DDB related code changes and to removing the DDB/GDB options and see how that goes.

Mark Millard
markmi at

More information about the freebsd-ppc mailing list