Why kernel kills processes that run out of memory instead of
just failing memory allocation system calls?
Nate Eldredge
neldredge at math.ucsd.edu
Thu May 21 07:15:24 UTC 2009
On Wed, 20 May 2009, Yuri wrote:
> Seems like failing system calls (mmap and sbrk) that allocate memory is more
> graceful and would allow the program to at least issue the reasonable error
> message.
> And more intelligent programs would be able to reduce used memory instead of
> just dying.
It's a feature called "memory overcommit". It has a variety of pros and
cons, and is somewhat controversial. One advantage is that programs often
allocate memory (in various ways) that they will never use, which under a
conservative policy would result in that memory being wasted, or programs
failing unnecessarily. With overcommit, you sometimes allocate more
memory than you have, on the assumption that some of it will not actually
be needed.
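To make the "allocate but never use" case concrete, here's an untested
sketch (the 1 GB chunk size and 64 GB total are arbitrary, and it assumes
a 64-bit machine). Under overcommit the malloc() calls typically all
succeed, since nothing is committed until the pages are touched; under a
strictly conservative policy they would start failing as soon as real
memory plus swap was spoken for.

#include <stdio.h>
#include <stdlib.h>

#define CHUNK (1024UL * 1024 * 1024)  /* 1 GB per allocation (arbitrary) */
#define CHUNKS 64                     /* 64 GB total, assumed > RAM + swap */

int main(void) {
    int i;
    for (i = 0; i < CHUNKS; i++) {
        /* Allocate but never touch the memory.  With overcommit these
           allocations usually all succeed, even though the total is far
           more than the machine could back with real pages. */
        if (malloc(CHUNK) == NULL) {
            /* A conservative accounting policy would likely end up here
               well before the loop finishes. */
            perror("malloc");
            break;
        }
    }
    printf("reserved %d GB of address space without committing it\n", i);
    return 0;
}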
Although memory allocated by mmap and sbrk usually does get used in fairly
short order, there are other ways of allocating memory that are easy to
overlook, and which may "allocate" memory that you don't actually intend
to use. Probably the best example is fork().
For instance, consider the following program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SIZE 1000000000 /* 1 GB */

int main(void) {
    char *buf = malloc(SIZE); /* 1 GB */
    memset(buf, 'x', SIZE); /* touch the buffer */
    pid_t pid = fork();
    if (pid == 0) {
        execlp("true", "true", (char *)NULL);
        perror("true");
        _exit(1);
    } else if (pid > 0) {
        for (;;); /* do work */
    } else {
        perror("fork");
        exit(1);
    }
    return 0;
}
Suppose we run this program on a machine with just over 1 GB of memory.
The fork() should give the child a private "copy" of the 1 GB buffer, by
setting it to copy-on-write. In principle, after the fork(), the child
might want to rewrite the buffer, which would require an additional 1 GB to
be available for the child's copy. So under a conservative allocation
policy, the kernel would have to reserve that extra 1 GB at the time of
the fork(). Since it can't do that on our hypothetical 1+ GB machine, the
fork() must fail, and the program won't work.
However, in fact that memory is not going to be used, because the child is
going to exec() right away, which will free the child's "copy". Indeed,
this happens most of the time with fork() (but of course the kernel can't
know when it will or won't). With overcommit, we pretend to give the
child a writable private copy of the buffer, in hopes that it won't
actually use more of it than we can fulfill with physical memory. If it
doesn't use it, all is well; if it does use it, then disaster occurs and
we have to start killing things.
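To see what it looks like when the child does call the bluff, here's an
untested variant of the program above, on the same hypothetical machine
with just over 1 GB of memory, where the child rewrites the buffer instead
of exec'ing:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SIZE 1000000000 /* 1 GB, as above */

int main(void) {
    char *buf = malloc(SIZE);
    memset(buf, 'x', SIZE); /* parent touches its copy */
    pid_t pid = fork();
    if (pid == 0) {
        /* Instead of exec'ing, rewrite the whole buffer.  Each write
           faults in a private copy of a shared page, so the kernel now
           needs a second gigabyte it doesn't have; this is where the
           killing starts. */
        memset(buf, 'y', SIZE);
        _exit(0);
    } else if (pid > 0) {
        for (;;); /* do work */
    } else {
        perror("fork");
        exit(1);
    }
    return 0;
}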
So the advantage is you can run programs like the fork()/exec() example
above on machines that technically don't have enough memory to do so. The
disadvantage, of course, is that if someone calls the bluff, then we kill
random processes.
However, this is not all that much worse than failing allocations:
although programs can in theory handle failed allocations and respond
accordingly, in practice they don't do so and just quit anyway. So in
real life, both cases result in disaster when memory "runs out"; with
overcommit, the disaster is a little less predictable but happens much
less often.
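For what it's worth, the "in theory" handling would look roughly like the
untested sketch below: check malloc()'s return value and fall back to a
smaller working set instead of crashing (the helper name is made up). The
catch under overcommit is that the failure tends to arrive as a kill
signal rather than a NULL return, so code like this rarely gets a chance
to run.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: try a big working buffer first, then fall back to
   progressively smaller ones instead of giving up. */
static void *alloc_with_fallback(size_t want, size_t *got) {
    size_t size = want;
    while (size >= 4096) {
        void *p = malloc(size);
        if (p != NULL) {
            *got = size;
            return p;
        }
        size /= 2; /* reduce memory use and retry */
    }
    return NULL;
}

int main(void) {
    size_t got = 0;
    void *buf = alloc_with_fallback((size_t)1 << 30, &got); /* ask for 1 GB */
    if (buf == NULL) {
        fprintf(stderr, "out of memory, giving up gracefully\n");
        return 1;
    }
    printf("working with a %zu-byte buffer\n", got);
    free(buf);
    return 0;
}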
If you google for "memory overcommit" you will see lots of opinions and
debate about this feature on various operating systems.
There may be a way to enable the conservative behavior; I know Linux has
an option to do this, but am not sure about FreeBSD. This might be useful
if you are paranoid, or run programs that you know will gracefully handle
running out of memory. IMHO for general use it is better to have
overcommit, but I know there are those who disagree.
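For reference, on Linux the knob is the vm.overcommit_memory sysctl
(/proc/sys/vm/overcommit_memory): 0 is the heuristic default, 1 always
overcommits, and 2 gives the conservative accounting behavior. Here's a
trivial, Linux-only sketch that just reports the current setting; changing
it needs root and sysctl(8):

#include <stdio.h>

int main(void) {
    /* Linux-specific: 0 = heuristic, 1 = always overcommit,
       2 = strict (conservative) accounting. */
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode;
    if (f == NULL || fscanf(f, "%d", &mode) != 1) {
        perror("overcommit_memory");
        return 1;
    }
    printf("vm.overcommit_memory = %d\n", mode);
    fclose(f);
    return 0;
}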
--
Nate Eldredge
neldredge at math.ucsd.edu