Some evidence about the PowerMac G5 multiprocessor boot hang ups with the modern VM_MAX_KERNEL_ADDRESS value
Mark Millard
marklmi at yahoo.com
Sat Feb 16 04:32:44 UTC 2019
[I've had to search for an address that would not have
my values corrupted/replaced. I did not find one. I've
added the assignment requested before the PCPU_SET but
until I find an address to use that preserves the values
that I assign, it likely does not matter.]
On 2019-Feb-15, at 16:04, Justin Hibbits <chmeeedalf at gmail.com> wrote:
> On Fri, 15 Feb 2019 15:26:09 -0800
> Mark Millard <marklmi at yahoo.com> wrote:
>
>> On 2019-Feb-15, at 14:09, Justin Hibbits <chmeeedalf at gmail.com>
>> wrote:
>>
>>> On Fri, 15 Feb 2019 14:01:18 -0800
>>> Mark Millard <marklmi at yahoo.com> wrote:
>>>
>>>> . . .
>>>>
>>>> Just to be sure, was the 0xc prefix a typo
>>>> (vs. 0xe as a prefix)?:
>>>>
>>>> 0xc000000000000010
>>>> vs.
>>>> 0xe000000000000010
>>>
>>> No, 0xc is correct. 0xc... is the address of the DMAP, and it so
>>> happens that the upper bits are ignored in real mode, simply by the
>>> fact that they're not placed onto the address bus. We take
>>> advantage of that elsewhere as well. So writing to 0xc000....10
>>> actually writes to 0x0000...10, both in real mode and translated
>>> mode. Writing to this at various points when the AP is starting
>>> up, we can see just how far into the boot it gets.
>>>
>>>> . . .
>>
>> I got an odd result from a successful boot. But first,
>> notes on what I did to the code:
>>
>> I used 0xc000000000000010 via:
>>
>> + *(unsigned long*)0xc000000000000010 = 0x10; // HACK!!!
>> + powerpc_sync(); // HACK!!!
>>
>> just before returning from cpudep_ap_early_bootstrap
>>
>> + *(unsigned long*)0xc000000000000010 = 0x20; // HACK!!!
>> + powerpc_sync(); // HACK!!!
>>
>> just before return from pmap_cpu_bootstrap
>>
>> + *(unsigned long*)0xc000000000000010 = 0x30; // HACK!!!
>> + powerpc_sync(); // HACK!!!
>>
>> just before return from cpudep_ap_bootstrap
>>
>> + *(unsigned long*)0xc000000000000010 = 0x40; // HACK!!!
>> + powerpc_sync(); // HACK!!!
>>
>> just before returning from cpudep_ap_setup
>>
>> + *(unsigned long*)0xc000000000000010 = 0x51; // HACK!!!
>> + powerpc_sync(); // HACK!!!
>>
>> just before the ap_letgo loop in machdep_ap_bootstrap [so just
>> after the PCPU_SET(awake,1)]
>>
>> + *(unsigned long*)0xc000000000000010 = 0x50; // HACK!!!
>> + powerpc_sync(); // HACK!!!
>>
>> just before sched_throw(NULL) in machdep_ap_bootstrap
>>
>>
>> For CPU 3 just after the two (void)*rstvec related
>> code sequences powermac_smp_start_cpu reported:
>>
>> *(unsigned long*)0xc000000000000010=0xffa34878A
>>
>> For CPU 2 just after the two (void)*rstvec related
>> code sequences powermac_smp_start_cpu reported:
>>
>> *(unsigned long*)0xc000000000000010=0x51
>>
>> For CPU 1 just after the two (void)*rstvec related
>> code sequences powermac_smp_start_cpu reported:
>>
>> *(unsigned long*)0xc000000000000010=0x51
>>
>> It looks to me like something is using the memory
>> that 0xc000000000000010 maps to.
>>
>> None of them reported the 0x50 from just before
>> the sched_throw(NULL) .
>>
>>
>> ===
>> Mark Millard
>> marklmi at yahoo.com
>> ( dsl-only.net went
>> away in early 2018-Mar)
>>
>
> Interesting. That value looks like it could be an OpenFirmware
> phandle. PowerISA does state that the first 256 bytes of memory is
> free for the OS (or firmware) to use as it sees fit, and we already
> know address 0x80 is special for OF. Maybe pick another address if you
> wish to continue this experiment. Can you write at the beginning of
> machdep_ap_bootstrap() some value, just before the PCPU_SET()? And then
> right after the sync?
Using 0xc000000000000020 resulted in the CPU 3 case
showing:
*(unsigned long*)0xc000000000000020=0x0
CPU 2 and CPU 1 again showed 0x51, as expected.
The same happened for 0xc000000000000030 .
After that I added the 0x5F hack shown below
(showing the 0xc0...40 address attempt):
void
machdep_ap_bootstrap(void)
{
	*(unsigned long*)0xc000000000000040 = 0x5F; // HACK!!!
	powerpc_sync(); // HACK!!!
	PCPU_SET(awake, 1);
	__asm __volatile("msync; isync");
	*(unsigned long*)0xc000000000000040 = 0x51; // HACK!!!
	powerpc_sync(); // HACK!!!
	while (ap_letgo == 0)
		__asm __volatile("or 31,31,31");
	__asm __volatile("or 6,6,6");
. . .
Then I continued my search for an address where my assigned
values would survive over the duration required.
The same happened for each of:
0xc000000000000040
0xc000000000000050
0xc000000000000060
0xc000000000000070
Is there another reasonable address range to try?
(I've not tried any 0xc0000000000000?8 addresses.)
As a reminder, machdep_ap_bootstrap for CPU 3
does echo its own messages even when the hang-up happens,
proving that it gets past the PCPU_SET(awake,1) and
the ap_letgo loop.
Maybe whatever clobbers the 0xc0000000000000?0 content
sometimes also clobbers something important to getting
pc_awake for CPU 3 set in the right place and to the
handling of CPU 2?
===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)