svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

Fri Jun 12 03:29:34 UTC 2020

On 2020-Jun-11, at 19:25, Justin Hibbits <chmeeedalf at gmail.com> wrote:

> On Thu, 11 Jun 2020 17:30:24 -0700
> Mark Millard <marklmi at yahoo.com> wrote:
> 
>> On 2020-Jun-11, at 16:49, Mark Millard <marklmi at yahoo.com> wrote:
>> 
>>> On 2020-Jun-11, at 14:42, Justin Hibbits <chmeeedalf at gmail.com>
>>> wrote:
>>> 
>>> On Thu, 11 Jun 2020 14:36:37 -0700
>>> Mark Millard <marklmi at yahoo.com> wrote:
>>> 
>>>> On 2020-Jun-11, at 13:55, Justin Hibbits <chmeeedalf at gmail.com>
>>>> wrote:
>>>> 
>>>>> On Wed, 10 Jun 2020 18:56:57 -0700
>>>>> Mark Millard <marklmi at yahoo.com> wrote:  
>>> . . .  
>>>> 
>>>> 
>>>>> That said, the attached patch effectively copies
>>>>> what's done in OEA6464 into OEA pmap.  Can you test it?    
>>>> 
>>>> I'll try it once I get a chance, probably later
>>>> today.
>>>> . . .  
>>> 
>>> No luck at the change being a fix, I'm afraid.
>>> 
>>> I verified that the build ended up with
>>> 
>>> 00926cb0 <moea_protect+0x2ec> bl      008e8dc8 <PHYS_TO_VM_PAGE>
>>> 00926cb4 <moea_protect+0x2f0> mr      r27,r3
>>> 00926cb8 <moea_protect+0x2f4> addi    r3,r3,36
>>> 00926cbc <moea_protect+0x2f8> hwsync
>>> 00926cc0 <moea_protect+0x2fc> lwarx   r25,0,r3
>>> 00926cc4 <moea_protect+0x300> li      r4,0
>>> 00926cc8 <moea_protect+0x304> stwcx.  r4,0,r3
>>> 00926ccc <moea_protect+0x308> bne-    00926cc0 <moea_protect+0x2fc>
>>> 00926cd0 <moea_protect+0x30c> andi.   r3,r25,128
>>> 00926cd4 <moea_protect+0x310> beq     00926ce0 <moea_protect+0x31c>
>>> 00926cd8 <moea_protect+0x314> mr      r3,r27
>>> 00926cdc <moea_protect+0x318> bl      008e9874 <vm_page_dirty_KBI>
>>> 
>>> in the installed kernel. So I doubt a
>>> mis-build would be involved. It is a
>>> head -r360311 based context still. World is
>>> without MALLOC_PRODUCTION so that jemalloc
>>> code executes its asserts, catching more
>>> and earlier than otherwise.
>>> 
>>> First test . . .
>>> 
>>> The only thing that the witness kernel reported was:
>>> 
>>> Jun 11 15:58:16 FBSDG4S2 kernel: lock order reversal:
>>> Jun 11 15:58:16 FBSDG4S2 kernel:  1st 0x216fb00 Mountpoints (UMA zone) @ /usr/src/sys/vm/uma_core.c:4387
>>> Jun 11 15:58:16 FBSDG4S2 kernel:  2nd 0x1192d2c kernelpmap (kernelpmap) @ /usr/src/sys/powerpc/aim/mmu_oea.c:1524
>>> Jun 11 15:58:16 FBSDG4S2 kernel: stack backtrace:
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #0 0x5ec164 at witness_debugger+0x94
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #1 0x5ebe3c at witness_checkorder+0xb50
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #2 0x536d5c at __mtx_lock_flags+0xcc
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #3 0x92636c at moea_kextract+0x5c
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #4 0x965d30 at pmap_kextract+0x98
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #5 0x8bfdbc at zone_release+0xf0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #6 0x8c7854 at bucket_drain+0x2f0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #7 0x8c728c at bucket_free+0x54
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #8 0x8c74fc at bucket_cache_reclaim+0x1bc
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #9 0x8c7004 at zone_reclaim+0x128
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #10 0x8c3a40 at uma_reclaim+0x170
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #11 0x8c3f70 at uma_reclaim_worker+0x68
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #12 0x50fbac at fork_exit+0xb0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #13 0x9684ac at fork_trampoline+0xc
>>> 
>>> The processes that were hit were listed as:
>>> 
>>> Jun 11 15:59:11 FBSDG4S2 kernel: pid 971 (cron), jid 0, uid 0: exited on signal 11 (core dumped)
>>> Jun 11 16:02:59 FBSDG4S2 kernel: pid 1111 (stress), jid 0, uid 0: exited on signal 6 (core dumped)
>>> Jun 11 16:03:27 FBSDG4S2 kernel: pid 871 (mountd), jid 0, uid 0: exited on signal 6 (core dumped)
>>> Jun 11 16:03:40 FBSDG4S2 kernel: pid 1065 (su), jid 0, uid 0: exited on signal 6
>>> Jun 11 16:04:13 FBSDG4S2 kernel: pid 1088 (su), jid 0, uid 0: exited on signal 6
>>> Jun 11 16:04:28 FBSDG4S2 kernel: pid 968 (sshd), jid 0, uid 0: exited on signal 6
>>> 
>>> Jun 11 16:05:42 FBSDG4S2 kernel: pid 1028 (login), jid 0, uid 0: exited on signal 6
>>> 
>>> Jun 11 16:05:46 FBSDG4S2 kernel: pid 873 (nfsd), jid 0, uid 0: exited on signal 6 (core dumped)
>>> 
>>> 
>>> I rebooted and reran the test, this time capturing the stress
>>> output and such (I did not capture copies during the first test,
>>> but it had similar messages at the same sort of points):
>>> 
>>> Second test . . .
>>> 
>>> # stress -m 2 --vm-bytes 1700M
>>> stress: info: [1166] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
>>> ^C
>>> 
>>> # exit
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
>>> Abort trap
>>> The other stuff was similar to the first test, not repeated here.
>> 
>> The updated code looks odd to me for how "m" is
>> handled (part of an egrep to ensure I show all the
>> usage of m):
>> 
>> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
>>        vm_page_t       m;
>>                        if (pm != kernel_pmap && m != NULL &&
>>                            (m->a.flags & PGA_EXECUTABLE) == 0 &&
>>                                if ((m->oflags & VPO_UNMANAGED) == 0)
>>                                        vm_page_aflag_set(m, PGA_EXECUTABLE);
>>                                m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
>>                                refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
>>                                        vm_page_dirty(m);
>>                                        vm_page_aflag_set(m, PGA_REFERENCED);
>> 
>> Or more completely, with notes mixed in:
>> 
>> void 
>> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
>>    vm_prot_t prot)
>> {
>>        . . .
>>        vm_page_t       m;
>>        . . .
>>        for (pvo = RB_NFIND(pvo_tree, &pm->pmap_pvo, &key);
>>            pvo != NULL && PVO_VADDR(pvo) < eva; pvo = tpvo) {
>>                . . .
>>                if (pt != NULL) {
>>                        . . .
>>                        if (pm != kernel_pmap && m != NULL &&
>> 
>> NOTE: m seems to be uninitialized but tested for being NULL
>> above.
>> 
>>                            (m->a.flags & PGA_EXECUTABLE) == 0 &&
>> 
>> Note: This looks to potentially be using a random, non-NULL
>> value for m during evaluation of m->a.flags .
>> 
>>                        . . .
>> 
>>                        if ((pvo->pvo_vaddr & PVO_MANAGED) &&
>>                            (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
>>                                m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
>> 
>> Note: m finally is potentially initialized(/set).
>> 
>>                                refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
>>                                if (refchg & PTE_CHG)
>>                                        vm_page_dirty(m);
>>                                if (refchg & PTE_REF)
>>                                        vm_page_aflag_set(m, PGA_REFERENCED);
>>                        . . .
>> 
>> Note: So, if m is set above, then the next loop
>> iteration(s) would use this then-old value
>> instead of an initialized value.
>> 
>> It looks to me like at least one assignment
>> to m is missing.
>> 
>> moea64_pvo_protect has pg that seems analogous to
>> m and has:
>> 
>>        pg = PHYS_TO_VM_PAGE(pvo->pvo_pte.pa & LPTE_RPGN);
>> . . .
>>        if (pm != kernel_pmap && pg != NULL &&
>>            (pg->a.flags & PGA_EXECUTABLE) == 0 &&
>>            (pvo->pvo_pte.pa & (LPTE_I | LPTE_G | LPTE_NOEXEC)) == 0) {
>>                if ((pg->oflags & VPO_UNMANAGED) == 0)
>>                        vm_page_aflag_set(pg, PGA_EXECUTABLE);
>> 
>> . . .
>>        if (pg != NULL && (pvo->pvo_vaddr & PVO_MANAGED) &&
>>            (oldprot & VM_PROT_WRITE)) {
>>                refchg |= atomic_readandclear_32(&pg->md.mdpg_attrs);
>>                if (refchg & LPTE_CHG)
>>                        vm_page_dirty(pg);
>>                if (refchg & LPTE_REF)
>>                        vm_page_aflag_set(pg, PGA_REFERENCED);
>> 
>> 
>> This might suggest something about what is missing.
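
To make the concern about m concrete, here is a minimal stand-alone
sketch of the pattern. It is not the kernel code: struct page,
struct entry, count_mismatches_late(), count_mismatches_early() and
stale_self_check() are hypothetical names used only for illustration,
and m starts out as NULL purely so that the sketch has defined
behavior (in the patched moea_protect() the first pass would instead
read an indeterminate value).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct page { int id; };
struct entry { struct page *pg; int managed; };

/*
 * Late, conditional assignment: the test at the top of each pass sees
 * the value left over from an earlier pass. Returns how many passes
 * tested a page that does not belong to the current entry.
 */
static int
count_mismatches_late(const struct entry *e, size_t n)
{
	struct page *m = NULL;	/* NULL only to keep this sketch well defined */
	int mismatches = 0;

	for (size_t i = 0; i < n; i++) {
		if (m != e[i].pg)	/* early use of m */
			mismatches++;
		if (e[i].managed)
			m = e[i].pg;	/* m is only set here */
	}
	return (mismatches);
}

/* Same loop with m assigned at the top of every pass. */
static int
count_mismatches_early(const struct entry *e, size_t n)
{
	int mismatches = 0;

	for (size_t i = 0; i < n; i++) {
		struct page *m = e[i].pg;	/* set before any use */

		if (m != e[i].pg)
			mismatches++;
	}
	return (mismatches);
}

/* Small self-check over three entries backed by distinct pages. */
static void
stale_self_check(void)
{
	struct page p[3] = { {0}, {1}, {2} };
	struct entry e[3] = { {&p[0], 1}, {&p[1], 0}, {&p[2], 1} };

	/* Every pass of the late-assignment loop tests the wrong page. */
	assert(count_mismatches_late(e, 3) == 3);
	assert(count_mismatches_early(e, 3) == 0);
}
```

In the late-assignment version every pass tests a page that belongs
to some earlier entry (or to none), which is the shape I am worried
about in the patched loop.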
> 
> Can you try moving the assignment to 'm' to right below the
> moea_pte_change() call?

Panics during boot. svnlite diff shown later.

That change got me a panic just after the lines about ada0
and cd0 details. (I do not know what internal stage that is.)
Hand-transcribed from a picture of the screen:

panic: vm_page_free_prep: mapping flags set in page 0xd032a078
. . .
panic
vm_page_free_prep
vm_page_free_toq
vm_page_free
vm_object_collapse
vm_object_deallocate
vm_map_process_deferred
vm_map_remove
exec_new_vmspace
exec_elf32_imgact
kern_execve
sys_execve
trap
powerpc_interrupt
user SC trap by 0x100d7af8 . . .




# svnlite diff /usr/src/sys/powerpc/aim/mmu_oea.c
Index: /usr/src/sys/powerpc/aim/mmu_oea.c
===================================================================

--- /usr/src/sys/powerpc/aim/mmu_oea.c	(revision 360322)
+++ /usr/src/sys/powerpc/aim/mmu_oea.c	(working copy)
@@ -1773,6 +1773,9 @@
 {
 	struct	pvo_entry *pvo, *tpvo, key;
 	struct	pte *pt;
+	struct	pte old_pte;
+	vm_page_t	m;
+	int32_t	refchg;
 
 	KASSERT(pm == &curproc->p_vmspace->vm_pmap || pm == kernel_pmap,
 	    ("moea_protect: non current pmap"));
@@ -1800,12 +1803,31 @@
 		pvo->pvo_pte.pte.pte_lo &= ~PTE_PP;
 		pvo->pvo_pte.pte.pte_lo |= PTE_BR;
 
+		old_pte = *pt;
+
 		/*
 		 * If the PVO is in the page table, update that pte as well.
 		 */
 		if (pt != NULL) {
 			moea_pte_change(pt, &pvo->pvo_pte.pte, pvo->pvo_vaddr);
+			m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
+			if (pm != kernel_pmap && m != NULL &&
+			    (m->a.flags & PGA_EXECUTABLE) == 0 &&
+			    (pvo->pvo_pte.pa & (PTE_I | PTE_G)) == 0) {
+				if ((m->oflags & VPO_UNMANAGED) == 0)
+					vm_page_aflag_set(m, PGA_EXECUTABLE);
+				moea_syncicache(pvo->pvo_pte.pa & PTE_RPGN,
+				    PAGE_SIZE);
+			}
 			mtx_unlock(&moea_table_mutex);
+			if ((pvo->pvo_vaddr & PVO_MANAGED) &&
+			    (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
+				refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
+				if (refchg & PTE_CHG)
+					vm_page_dirty(m);
+				if (refchg & PTE_REF)
+					vm_page_aflag_set(m, PGA_REFERENCED);
+			}
 		}
 	}
 	rw_wunlock(&pvh_global_lock);
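
As an aside on the refchg handling in the diff above: the
read-and-clear step can be sketched in portable C11 as an atomic
exchange with zero. This is only an illustration of the idea, not a
claim about how atomic_readandclear_32() is implemented in the tree.
SK_REF, SK_CHG, read_and_clear() and xchg_self_check() are names made
up for the sketch, with SK_CHG chosen as 0x80 only to mirror the
"andi. r3,r25,128" test in the disassembly quoted earlier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit values for this sketch only. */
#define SK_REF 0x100u
#define SK_CHG 0x080u

/* Fetch the old attribute bits and leave zero behind, as one atomic step. */
static uint32_t
read_and_clear(_Atomic uint32_t *attrs)
{
	return (atomic_exchange(attrs, 0));
}

/* Small self-check: the bits come back once, then the word stays clear. */
static void
xchg_self_check(void)
{
	_Atomic uint32_t attrs = SK_REF | SK_CHG;
	uint32_t refchg;

	refchg = read_and_clear(&attrs);
	assert((refchg & SK_CHG) != 0);
	assert((refchg & SK_REF) != 0);
	assert(atomic_load(&attrs) == 0);
	assert(read_and_clear(&attrs) == 0);
}
```

The point of using an exchange is that bits set by another CPU
between the read and the clear cannot be lost, which a separate load
followed by a store of zero would not guarantee.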


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)