Re: madvise(MADV_FREE) doesn't work in some cases?

From: Vitaliy Gusev <gusev.vitaliy_at_gmail.com>
Date: Mon, 05 Jul 2021 16:32:00 UTC
Hi,
> > Does it mean madvise() doesn't work well in FreeBSD or test does something wrong?
> 
> Your program does not do exactly what you described above.  There is a generic
> race to consume memory, and some specific details about madvise(2) on FreeBSD.
> 
> From the code, you do:
> - mmap anonymous private region
> - fork
> - both child and parent start touching the mmaped region.
> 
> Two processes race to consume 1/2 of RAM on your system.  If one of
> them happens to execute faster than the other, you do get to the case where
> one of them does madvise().  But it could be that the processes execute in
> lockstep, and try to eat all the memory before going to madvise().
> Did you exclude this case?
I believe I did everything right. The sleeps in the test serialise execution. To check again, I modified the test to print timestamps and to use MADV_DONTNEED:

Here is the source: http://cpp.sh/2rd4f
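
For readers who cannot open the link, below is a minimal sketch of the sequence of calls under discussion (mmap of an anonymous private region, fork, touch, madvise). It is not the linked test program: the helper names fork_touch_advise() and touch_and_advise() are made up for illustration, the region is meant to be small, and the two processes are serialised with waitpid() rather than with sleeps. Because the child exits before the parent touches the region, this sketch is not expected to reproduce the out-of-swap kill; it only shows the order of operations.

```c
#define _DEFAULT_SOURCE /* for MAP_ANON and madvise() on glibc; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write to every byte so each page is really allocated, then advise. */
static int touch_and_advise(char *p, size_t len, int advice)
{
    memset(p, 0xA5, len);
    return madvise(p, len, advice);
}

/*
 * Hypothetical helper (not from the linked test).
 * Returns 0 if both processes touched and advised the region, -1 otherwise.
 */
int fork_touch_advise(size_t len, int advice)
{
    assert(len > 0);

    /* Anonymous private region, created before fork so it becomes CoW. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: touch the region, advise, report the result via exit status. */
        _exit(touch_and_advise(p, len, advice) == 0 ? 0 : 1);
    }

    /* Parent: wait for the child (assumption: stands in for the sleeps). */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        munmap(p, len);
        return -1;
    }

    int rc = touch_and_advise(p, len, advice);
    munmap(p, len);

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return rc == 0 ? 0 : -1;
}
```

For example, fork_touch_advise(16 * 4096, MADV_DONTNEED) runs the sequence over a 64 KB region.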

I’ve run: 

$ ./mmapfork 2300
mmap 0x801000000 pid 40628
end 0x890c00000 len 0x8fc00000
pid 40628
pid 40629
40629: [1625500831] touch
40629: [1625500832] sleep before madvise
40629: [1625500833] madvise
40629: [1625500834] Press enter to exit
40628: [1625500845] touch
40628: [1625500846] sleep before madvise
40628: [1625500851] madvise
40628: [1625500852] Press enter to exit

As you can see, the second process (pid 40628) started touching memory 11 seconds after the first process (pid 40629) had already finished madvise() over the entire touched range.

And finally in dmesg:

pid 40629 (mmapfork), jid 0, uid 1001, was killed: out of swap space

So the result is the same as I described in my first email.

> Now, about the specific of madvise(MADV_FREE) on FreeBSD.  Due to the way
> CoW is implemented with the shadow chain of objects, we cannot drop the
> top of the shadow chain, otherwise instead of returning zeroed pages next
> time, we would return old content from back in time.  It was a relatively recent
> discovery, see bf5661f4a1af6931ec4b6, PR 240061.
> 
Thanks, I will look at it.
> To explain it in simplified form, when there is potential old content
> under the CoW copy for the mapping, we cannot drop CoW-ed pages. This
> is the motivation why madvise(MADV_FREE) does nothing for your program.
> When you run two instances without fork, there is no previous content
> and no CoW, so madvise() can safely remove the pages from the object,
> and on the next access they are zero-filled.

Do I understand correctly that it should work with MADV_DONTNEED? The "dontneed" variant doesn't work either.
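
To compare advice values in the no-fork case described above, a small probe like the following could be used. It is only a sketch: probe_after_advise() is a made-up helper name, and it makes no assumption about the outcome. It reports the first byte read back after madvise(), which may be the old fill value or zero depending on the operating system and the advice given.

```c
#define _DEFAULT_SOURCE /* for MAP_ANON and madvise() on glibc; harmless elsewhere */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the linked test).
 * Maps an anonymous private region without forking, fills it with 0xA5,
 * applies the given advice and returns the first byte read back
 * (0..255), or -1 if mmap() or madvise() failed.
 */
int probe_after_advise(size_t len, int advice)
{
    if (len == 0)
        return -1;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xA5, len);          /* touch every page */

    int result = -1;
    if (madvise(p, len, advice) == 0)
        result = p[0];             /* old content (0xA5) or zero fill (0) */

    munmap(p, len);
    return result;
}
```

Calling probe_after_advise(4096, MADV_DONTNEED) and probe_after_advise(4096, MADV_FREE) on each system shows whether the page was kept or replaced by a zero-filled one.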
> 
> You can read more details in the referenced commit, as well as some musings
> about way to make it somewhat better.
> 
> I must say that trying to allocate 1/2 + 1/2 of RAM this way, on a system
> without swap, is asking for trouble anyway.
I only wanted to point out that other operating systems handle this case well, whereas FreeBSD has trouble with it. Perhaps something in madvise() is unfinished?

----
Vitaliy Gusev