Why kernel kills processes that run out of memory instead of
just failing memory allocation system calls?
Nate Eldredge
neldredge at math.ucsd.edu
Thu May 21 21:37:22 UTC 2009

On Thu, 21 May 2009, Yuri wrote:
> Nate Eldredge wrote:
>> Suppose we run this program on a machine with just over 1 GB of memory. The
>> fork() should give the child a private "copy" of the 1 GB buffer, by
>> setting it to copy-on-write. In principle, after the fork(), the child
>> might want to rewrite the buffer, which would require an additional 1GB to
>> be available for the child's copy. So under a conservative allocation
>> policy, the kernel would have to reserve that extra 1 GB at the time of the
>> fork(). Since it can't do that on our hypothetical 1+ GB machine, the
>> fork() must fail, and the program won't work.
>
> I don't have a strong opinion for or against "memory overcommit". But I can
> imagine one could argue that fork with intent of exec is a faulty scenario
> that is a relic from the past. It can be replaced by some atomic method that
> would spawn the child without overcommitting.

I would say rather it's a centerpiece of Unix design, with an unfortunate
consequence. Actually, historically this would have been much more of a
problem than at present, since early Unix systems had much less memory, no
copy-on-write, and no virtual memory (virtual memory came in with BSD, it
appears; that was before my time).

The modern "atomic" method we have these days is posix_spawn, which has a
pretty complicated interface if you want to use pipes or anything. It
exists mostly for the benefit of systems whose hardware is too primitive
to be able to fork() in a reasonable manner. The old way to avoid the
problem of needing this extra memory temporarily was to use vfork(),
but this has always been a hack with a number of problems. IMHO neither
of these is preferable in principle to fork/exec.
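
To give a feel for what the posix_spawn interface looks like once a pipe is
involved, here is a rough sketch, not a definitive implementation. The helper
name spawn_capture and its buffer-based interface are made up for this
example; only the posix_spawn*, pipe, read and waitpid calls are standard
POSIX. It assumes the pipe descriptors do not collide with stdout (i.e. fd 1
is already open in the parent).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: run argv[0] (looked up via PATH) and capture its
 * stdout into buf as a NUL-terminated string.
 * Returns the number of bytes captured, or -1 on error.
 */
ssize_t spawn_capture(char *const argv[], char *buf, size_t buflen)
{
    int fds[2];
    posix_spawn_file_actions_t fa;
    pid_t pid;
    ssize_t total = 0, n;
    int status, err;

    assert(argv != NULL && argv[0] != NULL);  /* caller must pass a command */
    if (buflen == 0 || pipe(fds) != 0)
        return -1;
    if (posix_spawn_file_actions_init(&fa) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Describe what the child does with its descriptors before exec:
     * drop the read end, make the write end its stdout, drop the original. */
    if (posix_spawn_file_actions_addclose(&fa, fds[0]) != 0 ||
        posix_spawn_file_actions_adddup2(&fa, fds[1], STDOUT_FILENO) != 0 ||
        posix_spawn_file_actions_addclose(&fa, fds[1]) != 0) {
        posix_spawn_file_actions_destroy(&fa);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    err = posix_spawnp(&pid, argv[0], &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    close(fds[1]);  /* parent must close the write end or read() never sees EOF */
    if (err != 0) {
        close(fds[0]);
        return -1;
    }

    /* Read until EOF, an error, or the buffer is full. */
    while ((size_t)total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - (size_t)total)) > 0)
        total += n;
    buf[total] = '\0';
    close(fds[0]);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return total;
}
```

A caller would pass something like an argv of {"echo", "hello", NULL} and a
small buffer. The same job with fork/exec is a pipe(), a fork(), and a
dup2() in the child, which is the comparison being made above.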

Another good example is a large process that forks, where the child,
rather than exec'ing, performs some simple task that writes to very little
of its "copied" address space. Apache does this, as Bernd mentioned.
This case is also greatly helped by overcommit, and it can't be
circumvented by replacing fork() with something else. If the child really
doesn't need to modify any of its shared address space, a thread can
sometimes be used instead of a forked subprocess, but this has issues of
its own.

Of course all these problems are solved, under any policy, by having more
memory or swap. But overcommit allows you to do more with less.

> Are there any other than fork (and mmap/sbrk) situations that would
> overcommit?

Perhaps, but I can't think of good examples offhand.

--
Nate Eldredge
neldredge at math.ucsd.edu

More information about the freebsd-hackers mailing list