head -r315870 (e.g.): fork-then-swap-out [zero RES(ident memory)] questions tied to arm64 failures (zeroed memory pages)
Mark Millard
markmi at dsl-only.net
Fri Mar 24 09:16:20 UTC 2017
On 2017-Mar-22, at 3:09 PM, Mark Millard <markmi at dsl-only.net> wrote:
> The later questions are associated with:
>
> Bugzilla 217239 and 217138 (which I now expect have a common cause)
> https://lists.freebsd.org/pipermail/freebsd-arm/2017-March/015867.html
> (and its thread)
>
> These are tied to some process memory pages being trashed (to
> be zero) in particular types of arm64 contexts. This is
> reproducible in multiple arm64 contexts. The context is head
> but I believe there are reports in the lists tied to 11 as
> well.
>
> [Unfortunately the above all very much shows a learn-as-I-go
> property. Also the list has a sub-exchange on my testing other
> devices to check for device failures that is not directly
> relevant here.]
>
> These are tied to problems with fork-then-swap-out-then-swap-in
> contexts on arm64. (Even though I've occasionally typed amd64
> accidentally in places in those materials.) Memory allocations
> from before the fork are involved, ones not yet accessed by
> the child side of the fork at the time of the fork.
>
> fork sets up copy-on-write so that the child process temporarily
> shares pages (those it does not write), or should.
>
> But what if the parent process or both parent and child are
> swapped-out just shortly after the fork (so, say, top -PCwaopid
> shows zero for RES(ident memory))? What is the handling of, say,
> the child swapping back in while the parent still is swapped
> out?
>
> I notice that the child can have a much smaller SWAP figure
> than the parent so it would appear that the parent swap-out
> has pages that the child does not.
>
> So what if the child needs some of those pages? What should
> happen? (Vs. what does happen on arm64 in specific types
> of contexts? More below.)
>
> I ask mostly to see if I can do more to give evidence of
> what is going on and what to test for any proposed fix.
> I'm not likely to find the underlying problem(s) for arm64
> directly, unlike my investigation that led to
> fork-trampoline being fixed in head's -r313772
> (2017-Feb-15).
>
> [ https://lists.freebsd.org/pipermail/freebsd-arm/2017-February/015656.html
> and its thread, including when its title changed in:
> https://lists.freebsd.org/pipermail/freebsd-arm/2017-February/015678.html
> .]
>
> Part of that unlikely-to-solve status is because the
> context seems to be bound to a lot of special conditions
> and interesting behaviors simultaneously:
>
> A) Both my original reproductions of problem reports on the
> lists and the only (simple) programs for reproducing the
> problems involve fork-then-swap-out [zero RES(ident
> memory)]. Neither fork by itself nor swap-out/in by
> itself have been sufficient.
>
> B) jemalloc's tcache being in use (__je_tcache_maxclass == 32*1024)
> is part of every example of reproduction of the problem.
>
> C) allocations <= SMALL_MAXCLASS (SMALL_MAXCLASS==14*1024) get
> the failure (but bigger ones work, whether or not they fit
> inside __je_tcache_maxclass). Again: every example
> reproduction of the problem has this status.
>
> D) FreeBSD sometimes gets into a state where /etc/malloc.conf
> doing tcache:false does not seem to disable tcache. (After
> such has been observed, rebooting gets tcache:false working
> again.) [Related or independent? I've no clue.] Usually
> tcache:false seems to work and so avoids the failures.
>
> E) Use of POSIX_MADV_WILLNEED on the problematical allocation(s)
> in the child process after the fork but before the swap-outs
> of the child and parent prevents the failures (no read or
> write access to the memory from the child until after the
> swap-in). Doing so just in the parent process does not prevent
> the failures.
>
> F) Similar to (E) but read-accessing a byte or more of one or
> more pages from the problematical allocations from the child
> process after the fork but before the swap-out makes those
> specific pages not fail. (The others still fail, if any.)
> Done from the parent process instead does not avoid the
> failures.
>
> G) In a sequence like: su creates a sh which then runs one
> of my test programs that then forks off a child, it can be
> that all of the 4 processes show the zeroed memory area
> like the child process does. su and sh need to have
> swapped-out and back in for them to get failures. su and
> sh die once they hit an assert that fails due to the zeroed
> memory page(s). The asserts involve addresses also messed
> up in the test program processes (parent and child).
>
> In my reading I've not been able to determine what to expect
> for fork-then-swap-out-and-back-in for pages that the child
> process had not accessed yet but which might not be around
> for later activity because of the parent process's own
> swapped-out status at the time.
>
> Note: While I usually run a non-debug kernel I've tried
> a debug kernel and it provided no notices of problems. I
> got no additional information from the attempt.
>
> [My usual KERNCONF file includes GENERIC and then disables
> various debug items.]
>
> The bugzilla reports have example code for showing the
> problems and various behaviors. The two in 217239 are
> probably of more interest than the first one on 217138.

I just updated the pine64+ 2GB to head -r315870 and it
still gets the trashed-with-zeros pages from sequences
such as:

  allocation(s) (tcache in use, with sizes fitting <= SMALL_MAXCLASS)
  initialize them (to non-zero bytes)
  fork
  sleep/wait then swap-out [zero RES(ident memory)]
    (both parent and child in my tests)
    (Note the lack of access so far on the child process
    side.)
  swap-in

After swap-in both the Child and Parent see the indicated
allocation(s) as having only zero bytes instead of the
initialization values.
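
The sequence above can be sketched in C roughly as follows. This is a minimal illustration, not the test code attached to the Bugzilla reports. The names fork_sleep_check, count_mismatches and FILL_BYTE are hypothetical. The sketch does not force a swap-out by itself: it assumes the swap-out is produced externally during the sleep (for example by memory pressure) and confirmed with top showing zero RES for both processes. Whether tcache is actually in use depends on the system's malloc configuration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FILL_BYTE 0xA5 /* hypothetical non-zero initialization pattern */

/* Count the bytes in buf[0..len) that differ from the expected value. */
static size_t count_mismatches(const unsigned char *buf, size_t len,
                               unsigned char expected)
{
    size_t bad = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        if (buf[i] != expected)
            bad++;
    return bad;
}

/*
 * Allocate and initialize before the fork, leave the region untouched
 * in the child, sleep in both processes (the window for the swap-out),
 * then check the region in both.
 * Returns 0 if both processes still see the pattern, 1 if either one
 * sees changed bytes, -1 on a setup error.
 */
static int fork_sleep_check(size_t size, unsigned int sleep_seconds)
{
    unsigned char *region = malloc(size);
    if (region == NULL)
        return -1;
    memset(region, FILL_BYTE, size);

    fflush(NULL); /* do not duplicate pending stdio output in the child */
    pid_t pid = fork();
    if (pid < 0) {
        free(region);
        return -1;
    }

    /* Both sides wait here; no access to region by the child so far. */
    sleep(sleep_seconds);

    size_t bad = count_mismatches(region, size, FILL_BYTE);
    if (pid == 0)
        _exit(bad == 0 ? 0 : 1); /* child reports via its exit status */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        free(region);
        return -1;
    }
    int child_bad = !(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    printf("parent mismatched bytes: %zu, child %s\n",
           bad, child_bad ? "saw changed bytes" : "saw the pattern");
    free(region);
    return (bad != 0 || child_bad) ? 1 : 0;
}
```

Calling fork_sleep_check(14 * 1024, 120) from a main() would use a size at the SMALL_MAXCLASS value quoted earlier and leave two minutes in which to get both processes swapped out.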
===
Mark Millard
markmi at dsl-only.net