[Bug 217138] head (e.g.) -r315870 for arm64: sh vs. jemalloc asserts: include/jemalloc/internal/tsd.h:687: Failed assertion: "tsd_booted" once swapped in after being swapped out (comment 10)
bugzilla-noreply at freebsd.org
Sun Apr 9 00:41:43 UTC 2017
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=217138
--- Comment #36 from Mark Millard <markmi at dsl-only.net> ---
(In reply to Mark Millard from comment #35)
By breakpointing in the kernel after reaching the
"should be just sleeping" status, I've been able
to identify which code sequence is gradually
removing the mappings (the "small_mappings" path
described later in this comment). Specifically,
I started by breakpointing when
pmap_resident_count_dec was on the call stack,
in order to see the call chain(s) that lead to
it being called while RES(ident memory) is
gradually decreasing during the sleep that
comes just before forking.
(tid 100067 is [pagedaemon{pagedaemon}], which
is in vm_pageout_worker. bt does not show inlined
layers.)
[ thread pid 17 tid 100067 ]
Breakpoint at $x.1: undefined d65f03c0
db> bt
Tracing pid 17 tid 100067 td 0xfffffd0001c4aa00
. . .
handle_el1h_sync() at pmap_remove_l3+0xdc
pc = 0xffff000000604870 lr = 0xffff000000611158
sp = 0xffff000083a49980 fp = 0xffff000083a49a40
pmap_remove_l3() at pmap_ts_referenced+0x580
pc = 0xffff000000611158 lr = 0xffff000000615c50
sp = 0xffff000083a49a50 fp = 0xffff000083a49ac0
pmap_ts_referenced() at vm_pageout+0xe60
pc = 0xffff000000615c50 lr = 0xffff0000005d1f74
sp = 0xffff000083a49ad0 fp = 0xffff000083a49b50
vm_pageout() at fork_exit+0x94
pc = 0xffff0000005d1f74 lr = 0xffff0000002e01c0
sp = 0xffff000083a49b60 fp = 0xffff000083a49b90
fork_exit() at fork_trampoline+0x10
pc = 0xffff0000002e01c0 lr = 0xffff0000006177b4
sp = 0xffff000083a49ba0 fp = 0x0000000000000000
It turns out that pmap_ts_referenced is on its:
small_mappings:
. . .
path for the above so the pmap_remove_l3 call is
the one from that execution path. (Found by more
breakpointing after enabling such on the paths.)
So this is the path with:
(breakpoint hook not shown)
        /*
         * Wired pages cannot be paged out so
         * doing accessed bit emulation for
         * them is wasted effort. We do the
         * hard work for unwired pages only.
         */
        pmap_remove_l3(pmap, pte, pv->pv_va, tpde,
            &free, &lock);
        pmap_invalidate_page(pmap, pv->pv_va);
        cleared++;
        if (pvf == pv)
                pvf = NULL;
        pv = NULL;
. . .
From what I can tell this code is eliminating the
content of pages that, in the failing tests, have
no backing store yet (not swapped-out yet).
The observed behavior is that the pages this
happens to end up as zero pages once they are
swapped-out and back in.
I do not see anything putting the pages that this
happens to onto any other list that would keep
track of the page content. The swap-out and
swap-in seem to have ignored these pages and
to have been based on automatically zeroed pages
instead.
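For illustration only, the "ends up as zero pages" symptom
can be checked from user space roughly as sketched here.
This is not the test program used for this bug; the names
fill_pattern, count_zero_pages and PATTERN_BYTE are
hypothetical. The idea is to fill a region with a known
non-zero byte before the sleep and swap-out, then count
the pages that read back as all zeros afterwards.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names; not taken from the bug's actual test program. */
#define PATTERN_BYTE 0xA5u

/* Fill the whole region with a known non-zero byte. */
static void fill_pattern(unsigned char *buf, size_t len)
{
    memset(buf, PATTERN_BYTE, len);
}

/*
 * Count the whole pages in [buf, buf+len) that contain only
 * zero bytes. With the pattern above, any such page has lost
 * its content.
 */
static size_t count_zero_pages(const unsigned char *buf, size_t len,
    size_t page_size)
{
    size_t zero_pages = 0;

    for (size_t off = 0; off + page_size <= len; off += page_size) {
        size_t i = 0;

        while (i < page_size && buf[off + i] == 0)
            i++;
        if (i == page_size)
            zero_pages++;
    }
    return zero_pages;
}
```

A caller would run fill_pattern before the sleep and
count_zero_pages after the region has been swapped out
and back in; a non-zero count matches the failure
described above.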
Note that the (or a) question might be if these
pages should have ever gotten to this code at
all. (I'm no expert overall.) But that might
get into why POSIX_MADV_WILLNEED spanning each
page is sufficient to avoid the zeros issue for
work-then-swapout-and-back-in. I'll only write
here about what the backtrace code seems to be
doing if I'm interpreting correctly.
One oddity here is that pmap_remove_l3 does its own
pmap_invalidate_page to invalidate the same tlb entry as
the above pmap_invalidate_page, so a double-invalidate.
(I've no clue if such is just suboptimal vs. a form of
error.)
pmap_remove_l3 here does things that the analogous
sys/arm/arm/pmap-v6.c's pmap_ts_referenced does not
do and pmap-v6 does something this code does not.
arm64's pmap_remove_l3 does (in summary):
    pmap_invalidate_page
    decrements the resident_count
    pmap_unwire_l3
    (then pmap_ts_referenced's small_mappings code
    does another pmap_invalidate_page for the
    same argument values)
arm pmap-v6's pmap_ts_referenced's small_mappings
code does:
    conditional vm_page_dirty
    pte2_clear_bit for PTE2_A
    pmap_tlb_flush
There is, for example, no decrement of the
resident_count involved (that I found anyway).
But I've no clue just what should be analogous
vs. what should not between pmap-v6 and arm64's
pmap code in this area.
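The difference between the two summaries can be made
concrete with a toy user-space model. Everything in it
(struct toy_pmap, the TOY_PTE_* bits, both function names)
is hypothetical and is not kernel code; it only mirrors
the step lists above, and takes no position on which
behavior is the intended one.

```c
#include <stdint.h>

/* Toy model only: hypothetical types and bit values, not the kernel's. */
#define TOY_PTE_VALID  (1u << 0)
#define TOY_PTE_ACCESS (1u << 1)

struct toy_pmap {
    uint32_t pte;          /* a single page-table entry */
    long resident_count;   /* pages currently mapped */
    unsigned tlb_flushes;  /* invalidations issued so far */
};

/* Mirrors the arm64 summary: the whole mapping is removed. */
static void toy_small_mapping_remove(struct toy_pmap *pm)
{
    pm->pte = 0;           /* mapping gone, not just the access bit */
    pm->tlb_flushes++;     /* invalidate inside the removal */
    pm->resident_count--;  /* resident count decremented */
    pm->tlb_flushes++;     /* second invalidate by the caller */
}

/* Mirrors the pmap-v6 summary: only the access flag is cleared. */
static void toy_small_mapping_clear_access(struct toy_pmap *pm)
{
    pm->pte &= ~TOY_PTE_ACCESS;  /* mapping stays valid */
    pm->tlb_flushes++;           /* single flush */
}
```

In this model the first variant leaves pte at 0 with the
resident count reduced and two invalidations recorded,
while the second leaves the entry valid with the resident
count unchanged and one flush recorded.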
I'll also note that the code before the
arm64 small_mappings code also uses
pmap_remove_l3 but does not do the
decrement nor the extra pmap_invalidate_page
(for example). But again I do not know
how analogous the two paths should be.
Only the small_mappings path seems to have the
end-up-with-zeros problem for the later
fork-then-swap-out and then swap-back-in
context.