svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
Justin Hibbits
chmeeedalf at gmail.com
Fri Jun 12 02:25:37 UTC 2020
On Thu, 11 Jun 2020 17:30:24 -0700
Mark Millard <marklmi at yahoo.com> wrote:
> On 2020-Jun-11, at 16:49, Mark Millard <marklmi at yahoo.com> wrote:
>
> > On 2020-Jun-11, at 14:42, Justin Hibbits <chmeeedalf at gmail.com>
> > wrote:
> >
> > On Thu, 11 Jun 2020 14:36:37 -0700
> > Mark Millard <marklmi at yahoo.com> wrote:
> >
> >> On 2020-Jun-11, at 13:55, Justin Hibbits <chmeeedalf at gmail.com>
> >> wrote:
> >>
> >>> On Wed, 10 Jun 2020 18:56:57 -0700
> >>> Mark Millard <marklmi at yahoo.com> wrote:
> > . . .
> >>
> >>
> >>> That said, the attached patch effectively copies
> >>> what's done in OEA6464 into OEA pmap. Can you test it?
> >>
> >> I'll try it once I get a chance, probably later
> >> today.
> >> . . .
> >
> > No luck at the change being a fix, I'm afraid.
> >
> > I verified that the build ended up with
> >
> > 00926cb0 <moea_protect+0x2ec> bl 008e8dc8 <PHYS_TO_VM_PAGE>
> > 00926cb4 <moea_protect+0x2f0> mr r27,r3
> > 00926cb8 <moea_protect+0x2f4> addi r3,r3,36
> > 00926cbc <moea_protect+0x2f8> hwsync
> > 00926cc0 <moea_protect+0x2fc> lwarx r25,0,r3
> > 00926cc4 <moea_protect+0x300> li r4,0
> > 00926cc8 <moea_protect+0x304> stwcx. r4,0,r3
> > 00926ccc <moea_protect+0x308> bne- 00926cc0 <moea_protect+0x2fc>
> > 00926cd0 <moea_protect+0x30c> andi. r3,r25,128
> > 00926cd4 <moea_protect+0x310> beq 00926ce0 <moea_protect+0x31c>
> > 00926cd8 <moea_protect+0x314> mr r3,r27
> > 00926cdc <moea_protect+0x318> bl 008e9874 <vm_page_dirty_KBI>
> >
> > in the installed kernel. So I doubt a
> > mis-build would be involved. It is a
> > head -r360311 based context still. World is
> > without MALLOC_PRODUCTION so that jemalloc
> > code executes its asserts, catching more
> > and earlier than otherwise.
> >
> > First test . . .
> >
> > The only thing that the witness kernel reported was:
> >
> > Jun 11 15:58:16 FBSDG4S2 kernel: lock order reversal:
> > Jun 11 15:58:16 FBSDG4S2 kernel: 1st 0x216fb00 Mountpoints (UMA zone) @ /usr/src/sys/vm/uma_core.c:4387
> > Jun 11 15:58:16 FBSDG4S2 kernel: 2nd 0x1192d2c kernelpmap (kernelpmap) @ /usr/src/sys/powerpc/aim/mmu_oea.c:1524
> > Jun 11 15:58:16 FBSDG4S2 kernel: stack backtrace:
> > Jun 11 15:58:16 FBSDG4S2 kernel: #0 0x5ec164 at witness_debugger+0x94
> > Jun 11 15:58:16 FBSDG4S2 kernel: #1 0x5ebe3c at witness_checkorder+0xb50
> > Jun 11 15:58:16 FBSDG4S2 kernel: #2 0x536d5c at __mtx_lock_flags+0xcc
> > Jun 11 15:58:16 FBSDG4S2 kernel: #3 0x92636c at moea_kextract+0x5c
> > Jun 11 15:58:16 FBSDG4S2 kernel: #4 0x965d30 at pmap_kextract+0x98
> > Jun 11 15:58:16 FBSDG4S2 kernel: #5 0x8bfdbc at zone_release+0xf0
> > Jun 11 15:58:16 FBSDG4S2 kernel: #6 0x8c7854 at bucket_drain+0x2f0
> > Jun 11 15:58:16 FBSDG4S2 kernel: #7 0x8c728c at bucket_free+0x54
> > Jun 11 15:58:16 FBSDG4S2 kernel: #8 0x8c74fc at bucket_cache_reclaim+0x1bc
> > Jun 11 15:58:16 FBSDG4S2 kernel: #9 0x8c7004 at zone_reclaim+0x128
> > Jun 11 15:58:16 FBSDG4S2 kernel: #10 0x8c3a40 at uma_reclaim+0x170
> > Jun 11 15:58:16 FBSDG4S2 kernel: #11 0x8c3f70 at uma_reclaim_worker+0x68
> > Jun 11 15:58:16 FBSDG4S2 kernel: #12 0x50fbac at fork_exit+0xb0
> > Jun 11 15:58:16 FBSDG4S2 kernel: #13 0x9684ac at fork_trampoline+0xc
> >
> > The processes that were hit were listed as:
> >
> > Jun 11 15:59:11 FBSDG4S2 kernel: pid 971 (cron), jid 0, uid 0: exited on signal 11 (core dumped)
> > Jun 11 16:02:59 FBSDG4S2 kernel: pid 1111 (stress), jid 0, uid 0: exited on signal 6 (core dumped)
> > Jun 11 16:03:27 FBSDG4S2 kernel: pid 871 (mountd), jid 0, uid 0: exited on signal 6 (core dumped)
> > Jun 11 16:03:40 FBSDG4S2 kernel: pid 1065 (su), jid 0, uid 0: exited on signal 6
> > Jun 11 16:04:13 FBSDG4S2 kernel: pid 1088 (su), jid 0, uid 0: exited on signal 6
> > Jun 11 16:04:28 FBSDG4S2 kernel: pid 968 (sshd), jid 0, uid 0: exited on signal 6
> >
> > Jun 11 16:05:42 FBSDG4S2 kernel: pid 1028 (login), jid 0, uid 0: exited on signal 6
> >
> > Jun 11 16:05:46 FBSDG4S2 kernel: pid 873 (nfsd), jid 0, uid 0: exited on signal 6 (core dumped)
> >
> >
> > Rebooting and rerunning and showing the stress output and such
> > (I did not capture copies during the first test, but the first
> > test had similar messages at the same sort of points):
> >
> > Second test . . .
> >
> > # stress -m 2 --vm-bytes 1700M
> > stress: info: [1166] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
> > ^C
> >
> > # exit
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
> > Abort trap
> >
> > The other output was similar to the first test and is not repeated here.
>
> The updated code looks odd to me for how "m" is
> handled (part of an egrep to ensure I show all the
> usage of m):
>
> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
> vm_page_t m;
> if (pm != kernel_pmap && m != NULL &&
> (m->a.flags & PGA_EXECUTABLE) == 0 &&
> if ((m->oflags & VPO_UNMANAGED) == 0)
> vm_page_aflag_set(m, PGA_EXECUTABLE);
> m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
> refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
> vm_page_dirty(m);
> vm_page_aflag_set(m, PGA_REFERENCED);
>
> Or more completely, with notes mixed in:
>
> void
> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
> vm_prot_t prot)
> {
> . . .
> vm_page_t m;
> . . .
> for (pvo = RB_NFIND(pvo_tree, &pm->pmap_pvo, &key);
> pvo != NULL && PVO_VADDR(pvo) < eva; pvo = tpvo) {
> . . .
> if (pt != NULL) {
> . . .
> if (pm != kernel_pmap && m != NULL &&
>
> NOTE: m seems to be uninitialized but tested for being NULL
> above.
>
> (m->a.flags & PGA_EXECUTABLE) == 0 &&
>
> Note: This looks to potentially be using a random, non-NULL
> value for m during evaluation of m->a.flags .
>
> . . .
>
> if ((pvo->pvo_vaddr & PVO_MANAGED) &&
> (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
> m = PHYS_TO_VM_PAGE(old_pte.pte_lo &
> PTE_RPGN);
>
> Note: m finally is potentially initialized(/set).
>
> refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
> if (refchg & PTE_CHG)
> vm_page_dirty(m);
> if (refchg & PTE_REF)
> vm_page_aflag_set(m, PGA_REFERENCED);
> . . .
>
> Note: So, if m is set above, then the next loop
> iteration(s) would use this then-old value
> instead of an initialized value.
>
> It looks to me like at least one assignment
> to m is missing.
>
> moea64_pvo_protect has pg that seems analogous to
> m and has:
>
> pg = PHYS_TO_VM_PAGE(pvo->pvo_pte.pa & LPTE_RPGN);
> . . .
> if (pm != kernel_pmap && pg != NULL &&
> (pg->a.flags & PGA_EXECUTABLE) == 0 &&
> (pvo->pvo_pte.pa & (LPTE_I | LPTE_G | LPTE_NOEXEC)) == 0) {
> if ((pg->oflags & VPO_UNMANAGED) == 0)
> vm_page_aflag_set(pg, PGA_EXECUTABLE);
>
> . . .
> if (pg != NULL && (pvo->pvo_vaddr & PVO_MANAGED) &&
> (oldprot & VM_PROT_WRITE)) {
> refchg |= atomic_readandclear_32(&pg->md.mdpg_attrs);
> if (refchg & LPTE_CHG)
> vm_page_dirty(pg);
> if (refchg & LPTE_REF)
> vm_page_aflag_set(pg, PGA_REFERENCED);
>
>
> This might suggest something about what is missing.
Can you try moving the assignment to 'm' to right below the
moea_pte_change() call?
- Justin
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
>