Socket leak (Was: Re: What triggers "No Buffer Space Available"?)

Robert Watson rwatson at FreeBSD.org
Fri May 4 11:05:12 UTC 2007


On Thu, 3 May 2007, Marc G. Fournier wrote:

> I'm trying to probe this as well as I can, but network stacks and sockets 
> have never been my strong suit ...
>
> Robert mentioned in one of his emails that "Sockets can also exist 
> without any referencing process (if the application closes, but there is 
> still data draining on an open socket)."
>
> Now, that makes sense to me, I can understand that ... but, how would that 
> look in netstat -nA output?  Or would it show up at all?  For example, I have:
>
> mars# netstat -nA | grep c9655a20
> c9655a20 stream      0      0        0 c95d63f0        0        0
> c95d63f0 stream      0      0        0 c9655a20        0        0
> mars# netstat -nA | grep c95d63f0
> c9655a20 stream      0      0        0 c95d63f0        0        0
> c95d63f0 stream      0      0        0 c9655a20        0        0
>
> They are attached to each other, but there appears to be no 'referencing 
> process' ... it is now 10pm at night ... I saved a 'snapshot' of netstat -nA 
> output at 6:45pm, over 3 hours ago, and it has the same entries as above:
>
> c9655a20 stream      0      0        0 c95d63f0        0        0
> c95d63f0 stream      0      0        0 c9655a20        0        0
>
> again, if I'm reading this right, there is no 'referencing process' ... 
> first, of course, am I reading this right?
>
> second ... if I am reading this right, and, if I am understanding what 
> Robert was saying about 'draining' (a lot of ifs, I know) ... isn't it odd 
> for it to take >3 hours to drain?
>
> Again, if I'm reading / understanding things right, without the 'referencing 
> process', it won't show up in sockstat -u, which is why my netstat -nA 
> numbers keep growing, but sockstat -u numbers don't ... which also means 
> that there is no way to figure out what process / program is leaving 
> 'dangling sockets'? :(

I think we should be careful to avoid prematurely drawing conclusions about 
the source of the problem.  First question: have you confirmed that the 
resource limit on sockets is definitely what is causing the error you're 
seeing?  I.e., does the number of open sockets actually reach the configured 
maximum?
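
A rough sketch of that first check, assuming a FreeBSD system that exposes the
kern.ipc.numopensockets and kern.ipc.maxsockets sysctls.  The function names
are made up for illustration:

```sh
#!/bin/sh
# Hypothetical helper: integer percentage of the socket limit in use.
# $1 = number of sockets currently open, $2 = configured maximum.
socket_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Hypothetical wrapper around the live counters.  Assumes the FreeBSD
# sysctls kern.ipc.numopensockets and kern.ipc.maxsockets are present.
report_socket_usage() {
    open=$(sysctl -n kern.ipc.numopensockets) || return 1
    max=$(sysctl -n kern.ipc.maxsockets) || return 1
    echo "sockets: $open of $max ($(socket_usage_pct "$open" "$max")%)"
}
```

Running report_socket_usage around the time the error appears should show
whether the count is anywhere near the limit.  If it is not, the socket limit
is probably not what is producing the error.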

Second point: there are two kinds of resource leaks that seem likely 
candidates for a socket resource exhaustion problem. First, kernel bugs, in 
which the kernel maintains objects despite there being no application 
references, and second, application reference leaks, in which applications 
keep references to kernel objects despite no longer needing them.  Our 
immediate goal is to determine which of these is the case: is it a kernel bug, 
or an application bug?  Using tools like netstat and sockstat, we can try to 
determine whether all kernel sockets are properly referenced.  Experience suggests 
that it is an application bug, but we shouldn't rule out a kernel bug; the 
good news is that the tools to use in the debugging process are identical at 
this stage.
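
One possible way to sketch the "is every kernel socket referenced" check is
shown below.  It assumes that fstat(1) output mentions the same kernel
addresses that netstat -nA prints in its first column (worth verifying on the
system in question), and that both outputs have been saved to files.  The
function name and file names are hypothetical, and only UNIX domain socket
lines (type stream or dgram) are considered:

```sh
#!/bin/sh
# Hypothetical helper: print the addresses of stream/dgram sockets that
# appear in a saved "netstat -nA" snapshot ($1) but are not mentioned
# anywhere in saved "fstat" output ($2).
unreferenced_sockets() {
    # Column 1 is the kernel address, column 2 the socket type.
    awk '$2 == "stream" || $2 == "dgram" { print $1 }' "$1" | sort -u |
    while read -r addr; do
        # Fixed-string search; report the address only if no line matches.
        grep -qF -- "$addr" "$2" || echo "$addr"
    done
}
```

For example, after "netstat -nA > netstat.txt" and "fstat > fstat.txt",
calling "unreferenced_sockets netstat.txt fstat.txt" lists candidate sockets
with no referencing process.  The two snapshots are not taken atomically, so
short-lived sockets can show up as false positives; addresses that persist
across repeated runs are the interesting ones.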

Robert N M Watson
Computer Laboratory
University of Cambridge


More information about the freebsd-stable mailing list