How many segments does it take to span from VM_MIN_KERNEL_ADDRESS through VM_MAX_SAFE_KERNEL_ADDRESS? 128 in moea64_late_bootstrap
Justin Hibbits
chmeeedalf at gmail.com
Wed May 1 21:54:10 UTC 2019
On Wed, 1 May 2019 14:35:56 -0700
Mark Millard <marklmi at yahoo.com> wrote:
> >> What happens if you revert all your patches,
> >
> > Most of the patches in Bugzilla 233863 are not for this
> > issue at all and are not tied to starting the non-bsp
> > cpus. (The one for improving how close the Time Base
> > registers are is tied to starting these cpus.) Only the
> > aim/mp_cpudep.c and aim/slb.c changes seem relevant.
> >
> > Are you worried about some form of interaction that means
> > I need to avoid patches for other issues?
> >
> > Note: for now I'm staying at using head -r345758 as the
> > basis for my experiments.
> >
> >> and change this loop to
> >> stop at n_slb? So something more akin to:
> >>
> >> int i = 0;
> >>
> >> for (va = virtual_avail; va < virtual_end && i < n_slb - 1;
> >>     va += SEGMENT_LENGTH, i++)
> >>         ...
> >>
> >> If it reliably boots with that, then that's fine. We can prefault
> >> as much as we can and leave the rest for on-demand.
> >
> > I'm happy to experiment with this loop without my hack
> > for forcing the slb entry to exist in cpudep_ap_bootstrap.
> >
> > But, it seems to presume that the pc_curpcb's will
> > all always point into the lower address range spanned
> > when cpudep_ap_bootstrap is executing on the cpu.
> > Does some known property limit the pc_curpcb->
> > references to such? Only that would be sure to
> > avoid an slb-miss at that stage. Or is this just an
> > alternate hack or a means of getting evidence, not a
> > proposed solution?
> >
> > (Again, I'm happy to disable my hack that forces the
> > slb entry and to try the loop suggested.)
...
> And the patch for the loop looks like:
>
> virtual_end = VM_MAX_SAFE_KERNEL_ADDRESS;
>
> /*
> - * Map the entire KVA range into the SLB. We must not fault there.
> + * Map the lower-address part of the KVA range into the SLB. We must not fault there.
>   */
> #ifdef __powerpc64__
> - for (va = virtual_avail; va < virtual_end; va += SEGMENT_LENGTH)
> + i = 0;
> + for (va = virtual_avail; va < virtual_end && i<n_slbs-1; va += SEGMENT_LENGTH, i++)
>       moea64_bootstrap_slb_prefault(va, 0);
> #endif
>
Yep, that's the patch I was going for.
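
For a rough sense of the numbers, here is a minimal sketch in C of what
the capped loop does. It is a standalone illustration, not kernel code:
the 256 MB SEGMENT_LENGTH, the 32 GB address range used in the note
below, and the 64-entry SLB are assumptions chosen to match the 128 in
the subject line, and the helper names segments_spanned and
capped_prefaults are hypothetical.

```c
#include <stdint.h>

/* Assumption for illustration: one segment covers 256 MB. */
#define SEGMENT_LENGTH 0x10000000ULL

/* Number of segments needed to cover the range [start, end). */
static uint64_t
segments_spanned(uint64_t start, uint64_t end)
{
	if (end <= start)
		return (0);
	return ((end - start + SEGMENT_LENGTH - 1) / SEGMENT_LENGTH);
}

/*
 * Same loop shape as the patch: walk the range one segment at a time,
 * but stop after n_slbs - 1 iterations. Instead of calling
 * moea64_bootstrap_slb_prefault(), just count the iterations.
 */
static uint64_t
capped_prefaults(uint64_t virtual_avail, uint64_t virtual_end, int n_slbs)
{
	uint64_t va, count = 0;
	int i = 0;

	for (va = virtual_avail; va < virtual_end && i < n_slbs - 1;
	    va += SEGMENT_LENGTH, i++)
		count++;
	return (count);
}
```

Under those assumptions, segments_spanned(0xe000000000000000,
0xe000000800000000) gives 128, while capped_prefaults over the same
range with n_slbs = 64 gives 63, so roughly half of the range would be
left for on-demand faulting.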
>
> So I've built, installed, and have tested some: it did not go well
> overall.
>
> Using:
>
> OK set debug.verbose_sysinit=1
>
> to show better context about where the hangs occur, shows:
> (Typed from a screen picture.)
>
> subsystem a800000
> boot_run_interrupt_driven_config_hooks(0)...
> . . . (omitted) . . .
> done.
> vt_upgrade(&vt_consdev). . .
>
> The "vt_upgrade(&vt_consdev). . ." never says done when booting
> hangs with the above changes.
>
> Trying to boot a bunch of times did produce one
> completed boot, all 4 cpus working. Otherwise I'm
> using kernel.old to manage to complete a boot.
>
> I'll note that "vt_upgrade(&vt_consdev). . ." is where
> Dennis Clarke reported the hangs that he was seeing
> on 2019-Feb-14, before any of my patches were
> available.

Maybe try the commit that caused the problem back in July? r334498.

> You wrote in another reply:
>
> > The idea with this is if you can test with stock -CURRENT (or
> > post-VM_KERNEL_MAXADDR change), to eliminate any other variables.
> > This is *only* for testing that it brings up the APs, not that
> > they're properly synced. That will happen with other changes.
> > This is a proposed solution. From my understanding, we typically
> > allocate from low to high for KVA allocations, so keeping the low
> > addresses in memory long enough to bring up the APs to sanity is
> > the goal, so the commit would be along the lines of "Prefault as
> > much of KVA as we can fit into the SLB".
>
> This will have the sleep-gets-stuck problem, likely normally happening
> quickly after booting and logging in (presuming a boot). The resulting
> boots for such are not always all that useful after various threads
> hang up.

As mentioned, that's a different problem to solve. If we can at least
get the APs going, that's a big step up in the first place.

- Justin