Re: madvise(MADV_FREE) doesn't work in some cases?
Date: Mon, 05 Jul 2021 16:32:00 UTC
Hi,

> > Does it mean madvise() doesn't work well in FreeBSD or test does
> > something wrong?
>
> Your program does not do exactly what you described above. There is a
> generic race to consume memory, and some specific details about
> madvise(2) on FreeBSD.
>
> From the code, you do:
> - mmap anonymous private region
> - fork
> - both child and parent start touching the mmaped region.
>
> Two processes race to consume 1/2 of RAM on your system. If one of
> them happens to execute faster than the other, you do get to the case
> where one of them does madvise(). But it could be that the processes
> execute in lockstep, and try to eat all the memory before going to
> madvise(). Did you exclude this case?

I believe I did everything right. You can see the sleeps that serialise
execution. To check again, I modified the test to print timestamps and
to use MADV_DONTNEED.

Here is the source: http://cpp.sh/2rd4f

I ran:

$ ./mmapfork 2300
mmap 0x801000000 pid 40628 end 0x890c00000 len 0x8fc00000 pid 40628 pid 40629
40629: [1625500831] touch
40629: [1625500832] sleep before madvise
40629: [1625500833] madvise
40629: [1625500834] Press enter to exit
40628: [1625500845] touch
40628: [1625500846] sleep before madvise
40628: [1625500851] madvise
40628: [1625500852] Press enter to exit

You can see that the child started running 11 seconds after the parent
had already called madvise() on the whole range of touched memory.

And finally, in dmesg:

pid 40629 (mmapfork), jid 0, uid 1001, was killed: out of swap space

So the result is the same as I described in the first email.

> Now, about the specifics of madvise(MADV_FREE) on FreeBSD. Due to the
> way CoW is implemented with the shadow chain of objects, we cannot drop
> the top of the shadow chain, otherwise instead of returning zeroed pages
> next time, we would return content from back in time. It was a
> relatively recent discovery, see bf5661f4a1af6931ec4b6, PR 240061.

Thanks, I will look at it.

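
For reference, the sequence under discussion (mmap an anonymous private region, fork, touch it in both processes, then madvise) can be sketched roughly as below. This is only a minimal sketch and not the test posted at cpp.sh: the helper name mmap_fork_madvise, the pipe handshake (standing in for the sleeps that serialise execution) and the fill values are assumptions made for illustration. The first process stays alive after its madvise() while the second one touches its copy, as in the run above.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANON and madvise() on glibc */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original test): map `len` bytes,
 * fork, let the child touch and madvise() the region, then let the
 * parent do the same while the child is still alive.
 * Returns 0 if madvise() succeeded in both processes, -1 otherwise.
 */
int mmap_fork_madvise(size_t len, int advice)
{
    int to_parent[2], to_child[2];

    assert(len > 0);
    if (pipe(to_parent) < 0 || pipe(to_child) < 0)
        return -1;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_ANON | MAP_PRIVATE, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, len);
        return -1;
    }

    if (pid == 0) {
        /* Child: touch every page, then hand the range back. */
        close(to_parent[0]);
        close(to_child[1]);
        memset(p, 0xAA, len);
        char ok = (madvise(p, len, advice) == 0);
        if (write(to_parent[1], &ok, 1) != 1)
            _exit(1);
        /* Stay alive until the parent closes its end of the pipe. */
        char dummy;
        if (read(to_child[0], &dummy, 1) < 0)
            _exit(1);
        _exit(ok ? 0 : 1);
    }

    /* Parent: start touching only after the child has called madvise(). */
    close(to_parent[1]);
    close(to_child[0]);
    char child_ok = 0;
    if (read(to_parent[0], &child_ok, 1) != 1)
        child_ok = 0;
    memset(p, 0x55, len);
    int rc = madvise(p, len, advice);

    close(to_child[1]);              /* lets the child exit */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        status = -1;
    close(to_parent[0]);
    munmap(p, len);

    if (rc != 0 || !child_ok || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

Calling it as mmap_fork_madvise(len, MADV_FREE) or mmap_fork_madvise(len, MADV_DONTNEED) from a small main() only reports whether madvise() returned success. A zero return does not show that the kernel actually released the pages, which is the question in this thread, so memory usage still has to be observed separately.
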

> To explain it in simplified form, when there is potential old content
> under the CoW copy for the mapping, we cannot drop CoW-ed pages. This
> is the reason why madvise(MADV_FREE) does nothing for your program.
> When you run two instances without fork, there is no previous content
> and no CoW, so madvise() can safely remove the pages from the object,
> and on the next access they are zero-filled.

Do I understand correctly that it should work with MADV_DONTNEED? The
"dontneed" variant doesn't work either.

> You can read more details in the referenced commit, as well as some
> musings about ways to make it somewhat better.
>
> I must say that trying to allocate 1/2 + 1/2 of RAM this way, on a
> system without swap, is asking for trouble anyway.

I just want to point out that other operating systems handle this well,
whereas FreeBSD has trouble. Perhaps something in madvise() is not
finished?

----
Vitaliy Gusev