cvs commit: src/sys/sys param.h src/include Makefile netdb.h
res_update.h resolv.h src/include/arpa inet.h nameser.h nameser
Robert Watson
rwatson at FreeBSD.org
Thu Aug 3 09:43:22 UTC 2006
On Thu, 3 Aug 2006, Helge Oldach wrote:
> Well... I've spotted a regression not with the ports tree but with 6-STABLE.
> On several boxes with this change applied I see lots of sendmails stacking
> up over time, for example:
>
> 713 ?? Ss 0:01.05 sendmail: accepting connections (sendmail)
> 717 ?? Is 0:00.02 sendmail: Queue runner at 00:30:00 for /var/spool/client
> 31747 ?? I 0:00.00 sendmail: startup with 71.119.31.81 (sendmail)
> 32834 ?? I 0:00.00 sendmail: startup with 83.36.190.38 (sendmail)
> 33569 ?? I 0:00.00 sendmail: startup with 221.206.76.60 (sendmail)
> 34023 ?? I 0:00.00 sendmail: startup with 49.195.192.61.tokyo.flets.alph
> 34459 ?? I 0:00.00 sendmail: startup with 221.165.35.46 (sendmail)
> 36517 ?? I 0:00.00 sendmail: startup with 61.192.180.137 (sendmail)
> 38722 ?? I 0:00.00 sendmail: startup with 203.177.238.78 (sendmail)
> 39126 ?? I 0:00.00 sendmail: startup with 222.90.251.185 (sendmail)
> 39203 ?? I 0:00.00 sendmail: startup with 221.9.214.183 (sendmail)
> 39859 ?? I 0:00.00 sendmail: startup with 59.20.101.111 (sendmail)
> 41090 ?? I 0:00.00 sendmail: startup with 61.192.166.235 (sendmail)
> 41766 ?? I 0:00.00 sendmail: startup with 68.118.52.132 (sendmail)
> 42482 ?? I 0:00.00 sendmail: startup with 219.249.201.36 (sendmail)
> 42483 ?? I 0:00.00 sendmail: startup with 219.249.201.36 (sendmail)
> 43467 ?? I 0:00.00 sendmail: startup with 210.213.191.70 (sendmail)
> 43757 ?? I 0:00.00 sendmail: startup with 220.189.144.7 (sendmail)
> 44176 ?? I 0:00.00 sendmail: startup with 71.205.226.98 (sendmail)
> 44850 ?? I 0:00.00 sendmail: startup with 72.89.135.133 (sendmail)
> 44943 ?? I 0:00.00 sendmail: startup with 220.167.134.212 (sendmail)
> 48031 ?? I 0:00.00 sendmail: startup with 60.22.198.23 (sendmail)
>
> On one busy sendmail box I've seen literally thousands of such processes.
> Note that these processes don't disappear, so it is not related to
> sendmail.cf's timeouts.
>
> Browsing through the recent STABLE commits, I first thought it was related
> to the recent socket code changes, but no, it's not. It is definitely this
> introduction of BIND9's resolver. If I back out this change, all is fine
> again.
>
> As said, this is a very recent 6-STABLE. I'm tracking CTM, not cvs.
>
> I would seriously suggest testing this more thoroughly. I'm not asking to
> back it out right now, but this is definitely a breakage in 6-STABLE that
> should be fixed before 6.2.
I've had a similar report from Bjoern Zeeb. At first we thought his TCP
connections were stacking up because of a bug I introduced in 7.x, but it
turns out his sshd is wedging in name resolution and not closing the TCP
sockets (which are now visible in netstat in a way they weren't before).
We only concluded that it was not a kernel socket bug a day or so ago, so I'm
not sure he's had a chance to file a resolver bug report. He reported that
the application appeared to have two connected UDP sockets for name
resolution and one bad name server entry, but that the resolver appeared to
be blocked in a read on the UDP socket that had no data queued, rather than
on the one that did. This was all from looking at netstat, and as far as I
know he has not yet dug into the resolver to see what might be happening.
I've CC'd Bjoern in case he has further insight or can suggest what might be
going on.
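
To make the reported symptom concrete, here is a minimal sketch in Python. It
is not the libc/BIND9 resolver code, and the helper names (make_udp_pair,
read_first_ready, demo) are hypothetical. It assumes two connected UDP sockets
on loopback, only the second of which ever receives a reply. A blocking read
on the first socket would never return; waiting on both with select() picks up
the socket that actually has data queued.

```python
import select
import socket


def make_udp_pair():
    """Return (client, server): a loopback UDP server and a client connected to it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # let the kernel pick a free port
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.connect(server.getsockname())
    return client, server


def read_first_ready(socks, timeout):
    """Wait on all sockets at once; return (index, data) or None on timeout."""
    readable, _, _ = select.select(socks, [], [], timeout)
    if not readable:
        return None
    ready = readable[0]
    return socks.index(ready), ready.recv(512)


def demo():
    # Stand-ins for two name servers: the first never answers, the second does.
    c_bad, s_bad = make_udp_pair()
    c_good, s_good = make_udp_pair()
    s_good.settimeout(2.0)  # avoid hanging the demo if loopback misbehaves
    try:
        c_good.send(b"query")
        _, addr = s_good.recvfrom(512)
        s_good.sendto(b"answer", addr)
        # A plain c_bad.recv(512) here would block indefinitely, which is the
        # kind of wedge described above. Waiting on both sockets avoids it.
        return read_first_ready([c_bad, c_good], timeout=2.0)
    finally:
        for s in (c_bad, s_bad, c_good, s_good):
            s.close()


if __name__ == "__main__":
    print(demo())
```

Whether the resolver's actual problem is of this shape is not established;
the sketch only shows why reading from a fixed socket wedges while waiting on
all of them does not.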
Robert N M Watson
Computer Laboratory
University of Cambridge