arm64 fork/swap data corruptions: A ~110 line C program demonstrating an example (Pine64+ 2GB context) [Corrected subject: arm64!]
Mark Millard
markmi at dsl-only.net
Wed Mar 15 04:33:12 UTC 2017
A single Byte access to a 4K Byte aligned region between
the fork and wait/sleep/swap-out prevents that specific
4K Byte region from having the (bad) zeros.
Sounds like a page sized unit of behavior to me.
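If the single-page observation above generalizes, reading one byte in every 4K Byte stride of a region right after the fork should protect the whole region. The following is only a minimal sketch of that idea and has not been run on the affected Pine64+. touch_each_page() and ASSUMED_PAGE_SIZE are hypothetical names that are not part of the original program, and the 4096 value is an assumption based on the 4K Byte behavior described above.

```c
#include <assert.h>
#include <stddef.h>

// Assumed page size (4 KiB), matching the behavior described above.
// A real program would ask the system, e.g. sysconf(_SC_PAGESIZE).
#define ASSUMED_PAGE_SIZE ((size_t)4096u)

// Hypothetical helper (not in the original program): read one byte
// from each page-sized stride of [base, base+len). Returns the number
// of strides that were read.
static size_t touch_each_page(const volatile unsigned char *base, size_t len)
{
    assert(base != NULL || len == 0);

    size_t touched = 0;
    unsigned char sink = 0;

    for (size_t off = 0; off < len; off += ASSUMED_PAGE_SIZE) {
        sink ^= base[off]; // volatile read: the access is not optimized away
        ++touched;
    }

    // If base is not page aligned, the last byte can sit on one more
    // page than the strides above reached, so read it as well.
    if (len != 0) sink ^= base[len - 1];

    (void)sink;
    return touched;
}
```

In the test program this would be called in the child, between the fork and the sleep, on the bytes of the dynamically allocated region.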
Details follow.
On 2017-Mar-14, at 3:28 PM, Mark Millard <markmi at dsl-only.net> wrote:
> [test_check() between the fork and the wait/sleep prevents the
> failure from occurring. Even a small access to the memory at
> that stage prevents the failure. Details follow.]
>
> On 2017-Mar-14, at 11:07 AM, Mark Millard <markmi at dsl-only.net> wrote:
>
>> [This is just a correction to the subject-line text to say arm64
>> instead of amd64.]
>>
>> On 2017-Mar-14, at 12:58 AM, Mark Millard <markmi at dsl-only.net> wrote:
>>
>> [Another correction I'm afraid --about alternative program variations
>> this time.]
>>
>> On 2017-Mar-13, at 11:52 PM, Mark Millard <markmi at dsl-only.net> wrote:
>>
>>> I'm still at a loss about how to figure out what stages are messed
>>> up. (Memory coherency? Some memory not swapped out? Bad data swapped
>>> out? Wrong data swapped in?)
>>>
>>> But at least I've found a much smaller/simpler example to demonstrate
>>> some problem in my Pine64+ 2GB context.
>>>
>>> The Pine64+ 2GB is the only amd64 context that I have access to.
>>
>> Someday I'll learn to type arm64 the first time instead of amd64.
>>
>>> The following program fails its check for data
>>> having its expected byte pattern in dynamically
>>> allocated memory after a fork/swap-out/swap-in
>>> sequence.
>>>
>>> I'll note that the program sleeps for 60s after
>>> forking to give time to do something else to
>>> cause the parent and child processes to swap
>>> out (RES=0 as seen in top).
>>
>> The following about the extra test_check() was
>> wrong.
>>
>>> Note the source code line:
>>>
>>> // test_check(); // Adding this line prevents failure.
>>>
>>> It seems that accessing the region contents before forking
>>> and swapping avoids the problem. But there is a problem
>>> if the region was only written-to before the fork/swap.
>
> There is a place where a test_check call prevents the
> problem from happening at any stage: I tried putting a
> call between the fork and the later wait/sleep code:
I changed the byte sequence patterns to avoid
zero values since the bad values are zeros:
static value_type value(size_t v) { return (value_type)((v&0xFEu)|0x1u); }
// value now avoids the zero value since the failures
// are zeros.
With that I can then test accurately what bytes have
bad values vs. do not. I also changed to:
void partial_test_check(void) {
    if (value(0u)!=gbl_region.array[0])    raise(SIGABRT);
    if (value(0u)!=(*dyn_region).array[0]) raise(SIGABRT);
}
since previously [0] had a zero value and so I'd used [1].
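For readers without the earlier message at hand, here is a minimal sketch of what test_setup() and test_check() could look like with the non-zero value() pattern. The names value, value_type, gbl_region, dyn_region, test_setup and test_check come from this thread. REGION_SIZE, the region struct layout, the unsigned char element type and the first_mismatch() helper are my assumptions for illustration and may differ from the original ~110 line program.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned char value_type;           // assumed element type
#define REGION_SIZE ((size_t)(8u * 1024u))  // assumed size, for illustration only

typedef struct { value_type array[REGION_SIZE]; } region;

static region  gbl_region;         // global region
static region *dyn_region = NULL;  // dynamically allocated region

// Low bit always set, so no expected byte is ever zero.
static value_type value(size_t v) { return (value_type)((v & 0xFEu) | 0x1u); }

// Hypothetical helper: index of the first element that does not match
// the expected pattern, or n if every element matches.
static size_t first_mismatch(const value_type *a, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (a[i] != value(i)) return i;
    }
    return n;
}

// Sets up the memory byte patterns in both regions.
void test_setup(void)
{
    dyn_region = malloc(sizeof *dyn_region);
    if (dyn_region == NULL) raise(SIGABRT);

    for (size_t i = 0; i < REGION_SIZE; ++i) {
        gbl_region.array[i]  = value(i);
        dyn_region->array[i] = value(i);
    }
}

// Tests the memory byte patterns in both regions.
void test_check(void)
{
    assert(dyn_region != NULL);
    if (first_mismatch(gbl_region.array,  REGION_SIZE) != REGION_SIZE) raise(SIGABRT);
    if (first_mismatch(dyn_region->array, REGION_SIZE) != REGION_SIZE) raise(SIGABRT);
}
```

Because every expected byte is non-zero, any zero byte found later is unambiguously a bad value, and first_mismatch() gives the index where the damage starts.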
On this basis I'm now using the below. See the comments tied
to partial_test_check() calls:
extern void test_setup(void); // Sets up the memory byte patterns.
extern void test_check(void); // Tests the memory byte patterns.
extern void partial_test_check(void); // Tests just [0] of each region
// (gbl_region and dyn_region).
int main(void) {
    test_setup();
    test_check(); // Before fork() [passes]

    pid_t pid = fork();
    int wait_status = 0;

    // After fork; before wait/sleep/swap-out.

    if (0==pid) partial_test_check();
                     // Even the above is sufficient by
                     // itself to prevent failure for
                     // region_size 1u through
                     // 4u*1024u!
                     // But 4u*1024u+1u and above fail
                     // with this access to memory.
                     // The failing test is of
                     // (*dyn_region).array[4096u].
                     // This test never fails here.

    if (0<pid) partial_test_check(); // This never prevents
                                     // later failures (and
                                     // never fails here).

    if (0<pid) { wait(&wait_status); }

    if (-1!=wait_status && 0<=pid) {
        if (0==pid) {
            sleep(60);

            // During this manually force this process to
            // swap out. I use something like:
            //     stress -m 1 --vm-bytes 1800M
            // in another shell and ^C'ing it after top
            // shows the swapped status desired. 1800M
            // just happened to work on the Pine64+ 2GB
            // that I was using. I watch with top -PCwaopid .
        }

        test_check(); // After wait/sleep [fails for small-enough region_sizes]
    }
}
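To make the 4u*1024u vs. 4u*1024u+1u boundary concrete: if the array happens to start on a page boundary (an assumption, since the allocator decides that), array[0] through array[4095] share one 4K Byte page and array[4096] is the first byte of the next page, which the read of [0] never touches. The sketch below reports which 4K Byte strides of a region contain a zero byte. pages_with_zeros() and ASSUMED_PAGE_SIZE are hypothetical names, not part of the original program, and the strides are counted from the start of the region rather than from true page addresses.

```c
#include <assert.h>
#include <stddef.h>

// Assumed page size (4 KiB); a real program would query the system.
#define ASSUMED_PAGE_SIZE ((size_t)4096u)

// Hypothetical helper: scan [base, base+len) in page-sized strides and
// record the index of each stride that holds at least one zero byte.
// At most max_bad indices are stored in bad_pages[]; the return value
// is the total number of strides that contain a zero byte.
static size_t pages_with_zeros(const unsigned char *base, size_t len,
                               size_t *bad_pages, size_t max_bad)
{
    assert(base != NULL || len == 0);

    size_t count = 0;

    for (size_t start = 0; start < len; start += ASSUMED_PAGE_SIZE) {
        size_t end = (len - start < ASSUMED_PAGE_SIZE)
                         ? len
                         : start + ASSUMED_PAGE_SIZE;

        for (size_t i = start; i < end; ++i) {
            if (base[i] == 0) {
                if (count < max_bad) bad_pages[count] = start / ASSUMED_PAGE_SIZE;
                ++count;
                break; // one zero is enough to flag this stride
            }
        }
    }
    return count;
}
```

Run on the dynamically allocated region after the swap-in, a result listing stride 1 but not stride 0 would match the [4096u] failure noted in the comments above.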
> This suggests to me that the small access is forcing one or more things to
> be initialized for memory access that fork is not establishing of itself.
> It appears that if established correctly then the swap-out/swap-in
> sequence would work okay without needing the manual access to the memory.
>
>
> So far via this test I've not seen any evidence of problems with the global
> region but only the dynamically allocated region.
>
> However, the symptoms that started this investigation in a much more
> complicated context had an area of global memory from a .so that ended
> up being zero.
>
> I think that things should be fixed for this simpler context first and
> that further investigation of the sh/su related problems should wait to see what
> things are like after this test case works.
===
Mark Millard
markmi at dsl-only.net