arm64 fork/swap data corruptions: A ~110 line C program demonstrating an example (Pine64+ 2GB context) [Corrected subject: arm64!]
Mark Millard
markmi at dsl-only.net
Sun Mar 19 04:10:43 UTC 2017
On 2017-Mar-18, at 5:53 PM, Mark Millard <markmi at dsl-only.net> wrote:
> A new, significant discovery follows. . .
>
> While checking out use of procstat -v I ran
> into the following common property for the 3
> programs that I looked at:
>
> A) My small test program that fails for
> a dynamically allocated space.
>
> B) sh reporting Failed assertion: "tsd_booted".
>
> C) su reporting Failed assertion: "tsd_booted".
>
> Here are example addresses from the area of
> incorrectly zeroed memory (A then B then C):
>
> (lldb) print dyn_region
> (region *volatile) $0 = 0x0000000040616000
>
> (lldb) print &__je_tsd_booted
> (bool *) $0 = 0x0000000040618520
>
> (lldb) print &__je_tsd_booted
> (bool *) $0 = 0x0000000040618520
That last above was a copy/paste error. Correction:
(lldb) print &__je_tsd_booted
(bool *) $0 = 0x000000004061d520
> The first is from dynamic allocation ending up
> in the area. The other two are from libc.so.7
> globals/statics ending up in the general area.
>
> It looks like something is trashing a specific
> memory area for some reason, rather independently
> of what the program specifics are.
>
>
> Other notes:
>
> At least for my small program showing failure:
>
> Being explicit about the combined conditions for failure
> for my test program. . .
>
> Both tcache enabled and allocations fitting in SMALL_MAXCLASS
> are required in order to make the program fail.
>
> Note:
>
> (lldb) print __je_tcache_maxclass
> (size_t) $0 = 32768
>
> which is larger than SMALL_MAXCLASS. I've not observed
> failures for sizes above SMALL_MAXCLASS but not exceeding
> __je_tcache_maxclass.
>
> Thus tcache use by itself does not seem sufficient for
> my program to get corruption of its dynamically allocated
> memory: the small allocation size also matters.
>
>
> Be warned that I can not eliminate the possibility that
> the trashing changed what region of memory it trashed
> for larger allocations or when tcache is disabled.
The Pine64+ 2GB eventually got into a state where:
/etc/malloc.conf -> tcache:false
made no difference and the failure kept occurring
with that symbolic link in place.
But after a reboot of the Pine64+ 2GB,
/etc/malloc.conf -> tcache:false was again effective
for my test program. (It was still present from
before the reboot.)
I checked the .core files and the allocated address
assigned to dyn_region was the same in the tries
before and after the reboot. (I had put in an
additional raise(SIGABRT) so I'd always have
a core file to look at.)
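
A check of this general shape can be sketched in C as follows. This is not the ~110 line test program, which is not reproduced in this message. Apart from dyn_region, every name here is hypothetical. The 4096-byte size is an assumption chosen to stay below SMALL_MAXCLASS. The sketch also makes no attempt to force the swapping that the subject line refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE 4096u /* assumption: a "small" jemalloc allocation */
#define FILL_BYTE   0xA5  /* arbitrary non-zero fill pattern */

/* Count the bytes in p[0..n-1] that no longer hold the fill pattern. */
static size_t count_bad_bytes(const volatile unsigned char *p, size_t n)
{
    assert(p != NULL);
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (p[i] != FILL_BYTE)
            bad++;
    return bad;
}

/*
 * Allocate and fill a region, fork, then re-check the region in both
 * the child and the parent.
 * Returns 0 if both saw the pattern intact, 1 if either saw a changed
 * byte (or the child died abnormally), -1 on a setup failure.
 */
static int fork_and_check(void)
{
    unsigned char *volatile dyn_region = malloc(REGION_SIZE);
    if (dyn_region == NULL)
        return -1;
    memset(dyn_region, FILL_BYTE, REGION_SIZE);

    pid_t pid = fork();
    if (pid < 0) {
        free(dyn_region);
        return -1;
    }
    if (pid == 0) {
        /* Child: report its view of the region via the exit status. */
        _exit(count_bad_bytes(dyn_region, REGION_SIZE) != 0 ? 1 : 0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        free(dyn_region);
        return -1;
    }
    int child_bad = !WIFEXITED(status) || WEXITSTATUS(status) != 0;
    int parent_bad = count_bad_bytes(dyn_region, REGION_SIZE) != 0;
    free(dyn_region);
    return (child_bad || parent_bad) ? 1 : 0;
}
```

A caller that wants a core file for every run, as described above, could call raise(SIGABRT) after fork_and_check() returns instead of exiting normally.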
Apparently /etc/malloc.conf -> tcache:false was
being ignored before the reboot for some reason?
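
jemalloc takes its option string from the name that the /etc/malloc.conf symbolic link points to, so one small sanity check is to read that link from inside the process under test. The sketch below is illustrative only, with hypothetical helper names. It confirms what the link says, not which options the allocator actually applied, so it cannot by itself explain the behavior seen before the reboot.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the target of the symbolic link at path into buf and
 * NUL-terminate it. Returns 0 on success, -1 if path could not be
 * read as a symbolic link. A target longer than buflen - 1 bytes is
 * silently truncated by readlink().
 */
static int read_link_target(const char *path, char *buf, size_t buflen)
{
    assert(path != NULL && buf != NULL && buflen > 0);
    ssize_t n = readlink(path, buf, buflen - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/*
 * Return 1 if the option string contains "tcache:false", else 0.
 * Assumption: a plain substring match is enough for this check; it
 * is not a full parser for the option syntax.
 */
static int options_disable_tcache(const char *opts)
{
    assert(opts != NULL);
    return strstr(opts, "tcache:false") != NULL;
}
```

Calling read_link_target("/etc/malloc.conf", buf, sizeof buf) and then options_disable_tcache(buf) early in the test program would record whether the link was visible to that process at run time.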
===
Mark Millard
markmi at dsl-only.net