Socket leak (Was: Re: What triggers "No Buffer Space Available"?)
Robert Watson
rwatson at FreeBSD.org
Fri May 4 11:05:12 UTC 2007
On Thu, 3 May 2007, Marc G. Fournier wrote:
> I'm trying to probe this as well as I can, but network stacks and sockets
> have never been my strong suit ...
>
> Robert had mentioned in one of his emails that "Sockets can also exist
> without any referencing process (if the application closes, but there is
> still data draining on an open socket)."
>
> Now, that makes sense to me, I can understand that ... but, how would that
> look as far as netstat -nA shows? Or, would it? For example, I have:
>
> mars# netstat -nA | grep c9655a20
> c9655a20 stream 0 0 0 c95d63f0 0 0
> c95d63f0 stream 0 0 0 c9655a20 0 0
> mars# netstat -nA | grep c95d63f0
> c9655a20 stream 0 0 0 c95d63f0 0 0
> c95d63f0 stream 0 0 0 c9655a20 0 0
>
> They are attached to each other, but there appears to be no 'referencing
> process' ... it is now 10pm at night ... I saved a 'snapshot' of netstat -nA
> output at 6:45pm, over 3 hours ago, and it has the same entries as above:
>
> c9655a20 stream 0 0 0 c95d63f0 0 0
> c95d63f0 stream 0 0 0 c9655a20 0 0
>
> again, if I'm reading this right, there is no 'referencing process' ...
> first, of course, am I reading this right?
>
> second ... if I am reading this right, and, if I am understanding what
> Robert was saying about 'draining' (a lot of ifs, I know) ... isn't it odd
> for it to take >3 hours to drain?
>
> Again, if I'm reading / understanding things right, without the 'referencing
> process', it won't show up in sockstat -u, which is why my netstat -nA
> numbers keep growing, but sockstat -u numbers don't ... which also means
> that there is no way to figure out what process / program is leaving
> 'dangling sockets'? :(

I think we should be careful to avoid prematurely drawing conclusions about
the source of the problem. First question: have you confirmed that the
resource limit on sockets is definitely what is causing the error you're
seeing? I.e., does the number of open sockets actually reach the maximum?
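
As a rough illustration of that first check, the sketch below compares the
open-socket count against the limit. It assumes the two counters are read
with sysctl under the names kern.ipc.numopensockets and kern.ipc.maxsockets;
those names are not given in this message, so verify them on your system.
The sample values are invented for the example.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines, as printed by sysctl, into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            values[name.strip()] = int(value)
    return values


def socket_headroom(text,
                    open_key="kern.ipc.numopensockets",  # assumed sysctl name
                    max_key="kern.ipc.maxsockets"):      # assumed sysctl name
    """Return how many more sockets can be allocated before the limit."""
    values = parse_sysctl(text)
    return values[max_key] - values[open_key]


# Invented sample output; real numbers come from the affected machine.
sample = """kern.ipc.maxsockets: 12328
kern.ipc.numopensockets: 12320
"""

if __name__ == "__main__":
    print("sockets left before the limit:", socket_headroom(sample))
```

If the headroom stays large while the error still appears, the socket limit
is probably not the resource being exhausted.
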
Second point: there are two kinds of resource leaks that seem likely
candidates for a socket resource exhaustion problem. First, kernel bugs, in
which the kernel maintains objects despite there being no application
references, and second, application reference leaks, in which applications
keep references to kernel objects despite no longer needing them. Our
immediate goal is to determine which of these is the case: is it a kernel bug,
or an application bug? Using tools like netstat and sockstat, we can try to
determine whether all kernel sockets are properly referenced. Experience suggests
that it is an application bug, but we shouldn't rule out a kernel bug; the
good news is that the tools to use in the debugging process are identical at
this stage.
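
One way to start is to compare two netstat -nA snapshots taken some time
apart and list the UNIX domain sockets present in both, in the spirit of the
comparison quoted above. This is a minimal sketch, assuming the column layout
of the quoted lines (address first, type second, connected peer sixth). The
function names and the c0000001/c0000002 pair in the older sample are
hypothetical.

```python
def unix_sockets(netstat_text):
    """Map socket address -> connected peer for UNIX domain socket lines."""
    sockets = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        # Header lines and TCP/UDP lines have a different second column.
        if len(fields) >= 8 and fields[1] in ("stream", "dgram"):
            sockets[fields[0]] = fields[5]
    return sockets


def persistent(older, newer):
    """Return addresses present in both snapshots with the same peer."""
    old, new = unix_sockets(older), unix_sockets(newer)
    return sorted(addr for addr in old if new.get(addr) == old[addr])


# The first pair is taken from the quoted output; the second is made up.
older = """c9655a20 stream 0 0 0 c95d63f0 0 0
c95d63f0 stream 0 0 0 c9655a20 0 0
c0000001 stream 0 0 0 c0000002 0 0
c0000002 stream 0 0 0 c0000001 0 0
"""
newer = """c9655a20 stream 0 0 0 c95d63f0 0 0
c95d63f0 stream 0 0 0 c9655a20 0 0
"""

if __name__ == "__main__":
    for addr in persistent(older, newer):
        print(addr)
```

A socket that persists is not necessarily leaked: long-lived processes
legitimately hold sockets for hours, so the result is only a list of
candidates to check against sockstat.
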
Robert N M Watson
Computer Laboratory
University of Cambridge
More information about the freebsd-stable mailing list