How many segments does it take to span from VM_MIN_KERNEL_ADDRESS through VM_MAX_SAFE_KERNEL_ADDRESS? 128 in moea64_late_bootstrap

Wed May 1 21:36:08 UTC 2019

[This just reports on the experiment, but not from an official
head version or snapshot: it is preliminary information in the
interest of time. It hangs, but at a different place/stage than
cpudep_ap_bootstrap, matching Dennis Clarke's 2019-Feb-14 reports
about hangups, from before my patches were available.]

On 2019-May-1, at 11:51, Mark Millard <marklmi at yahoo.com> wrote:

> On 2019-May-1, at 07:40, Justin Hibbits <chmeeedalf at gmail.com> wrote:
> 
>> On Tue, 30 Apr 2019 21:45:00 -0700
>> Mark Millard <marklmi at yahoo.com> wrote:
>> 
>>> [I realized another implication about another point of
>>> potential slb-misses in cpudep_ap_bootstrap: the
>>> address in sprg0 on the cpu might end up not able to be
>>> dereferenced.]
>>> 
>>> On 2019-Apr-30, at 20:58, Mark Millard <marklmi at yahoo.com> wrote:
>>> 
>>>> [At the end this note shows why the old VM_MAX_KERNEL_ADDRESS
>>>> led to no slb-miss exceptions in cpudep_ap_bootstrap.]
>>>> 
>>>> There is code in moea64_late_bootstrap that looks like:
>>>> 
>>>>      virtual_avail = VM_MIN_KERNEL_ADDRESS;
>>>>      virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;
>>>> 
>>>>      /*
>>>>       * Map the entire KVA range into the SLB. We must not fault there.
>>>>       */
>>>>      #ifdef __powerpc64__
>>>>      for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
>>>>              moea64_bootstrap_slb_prefault(va, 0);
>>>>      #endif
>> 
>> What happens if you revert all your patches,
> 
> Most of the patches in Bugzilla 233863 are not for this
> issue at all and are not tied to starting the non-bsp
> cpus. (The one for improving how close the Time Base
> registers are is tied to starting these cpus.) Only the
> aim/mp_cpudep.c and aim/slb.c changes seem relevant.
> 
> Are you worried about some form of interaction that means
> I need to avoid patches for other issues?
> 
> Note: for now I'm staying at using head -r345758 as the
> basis for my experiments.
> 
>> and change this loop to
>> stop at n_slb?  So something more akin to:
>> 
>> 	int i = 0;
>> 
>> 	for (va = virtual_avail; va < virtual_end && i < n_slb - 1;
>> 	    va += SEGMENT_LENGTH, i++)
>> 		...
>> 
>> If it reliably boots with that, then that's fine.  We can prefault as
>> much as we can and leave the rest for on-demand.
> 
> I'm happy to experiment with this loop without my hack
> for forcing the slb entry to exist in cpudep_ap_bootstrap.
> 
> But, it seems to presume that the pc_curpcb's will
> all always point into the lower address range spanned
> when cpudep_ap_bootstrap is executing on the cpu.
> Does some known property limit the pc_curpcb->
> references to such? Only that would be sure to
> avoid an slb-miss at that stage. Or is this just an
> alternate hack or a means of getting evidence, not a
> proposed solution?
> 
> (Again, I'm happy to disable my hack that forces the
> slb entry and to try the loop suggested.)

Note: I've not started any experiments for isync's
related to instructions such as slbmte yet: that was
all just inspection and reading about requirements
so far.

So to disable my slb-force-no-miss hack in
cpudep_ap_bootstrap I reverted it:

# svnlite revert /usr/src/sys/powerpc/aim/mp_cpudep.c /usr/src/sys/powerpc/aim/slb.c
Reverted 'sys/powerpc/aim/mp_cpudep.c'
Reverted 'sys/powerpc/aim/slb.c'

(hack_into_slb_if_needed(...) was implemented in mp_cpudep.c and
used in slb.c before reverting.)

And the patch for the loop looks like:

 	virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS; 

 	/*
-	 * Map the entire KVA range into the SLB. We must not fault there.
+	 * Map the lower-address part of the KVA range into the SLB. We must not fault there.
 	 */
 	#ifdef __powerpc64__
-	for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
+	i = 0;
+	for (va = virtual_avail; va < virtual_end && i<n_slbs-1; va += SEGMENT_LENGTH, i++)
 		moea64_bootstrap_slb_prefault(va, 0);
 	#endif

So I've built, installed, and done some testing: overall it did not
go well.

Using:

OK set debug.verbose_sysinit=1

to get better context about where the hangs occur shows:
(Typed from a picture of the screen.)

subsystem a800000
  boot_run_interrupt_driven_config_hooks(0)...
. . . (omitted) . . .
done.
  vt_upgrade(&vt_consdev). . .

The "vt_upgrade(&vt_consdev). . ." never says done when booting
hangs with the above changes.

Repeated boot attempts did produce one completed boot,
with all 4 cpus working. Otherwise I'm using kernel.old
in order to complete a boot.

I'll note that "vt_upgrade(&vt_consdev). . ." is where
Dennis Clarke reported the hangups that he was seeing
on 2019-Feb-14, before any of my patches were
available.

You wrote in another reply:

> The idea with this is if you can test with stock -CURRENT (or
> post-VM_KERNEL_MAXADDR change), to eliminate any other variables.  This
> is *only* for testing that it brings up the APs, not that they're
> properly synced.  That will happen with other changes.  This is a
> proposed solution.  From my understanding, we typically allocate from
> low to high for KVA allocations, so keeping the low addresses in memory
> long enough to bring up the APs to sanity is the goal, so the commit
> would be along the lines of "Prefault as much of KVA as we can fit into
> the SLB".

This will have the sleep-gets-stuck problem, which normally shows up
quickly after booting and logging in (presuming the boot completes).
Such boots are not always all that useful once various threads have
hung up.

Also, getting such an almost-exactly-head-revision variant set up without
messing up my current context will take some time: I'm not set up for
such. I currently have no access to a cross-build environment: the
activity is self-hosted on a 2-socket/2-cores-each G5. So I will have to
build from a context that has patches (or is too old).

Thus the preliminary results above, which I could produce quickly, are
not from the context that you asked for.

But it also appears that "vt_upgrade(&vt_consdev). . ." would not
be tied to cpudep_ap_bootstrap and evaluating:

sp = pcpup->pc_curpcb->pcb_sp

Still, I'll work on getting a gcc-4.2.1-based just-head context built,
though I do not expect it to install and boot in that state. So I will
have to build from a context that has patches, using a different source
tree for the "self-hosted cross build" to do your kind of experiment.
After that I'd be ready for "self-hosted cross built" experiments.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)