sshd with zombie process on FreeBSD 10.0-STABLE - workaround

Marcelo Gondim gondim at bsdinfo.com.br
Thu Mar 20 23:46:59 UTC 2014


On 20/03/14 11:58, John Baldwin wrote:
> On Wednesday, March 19, 2014 1:47:10 pm Marcelo Gondim wrote:
>> On 19/03/14 13:01, Kevin Oberman wrote:
>>> On Wed, Mar 19, 2014 at 6:00 AM, Marcelo Gondim <gondim at bsdinfo.com.br> wrote:
>>>> Hi all,
>>>>
>>>> Until a proper fix is available, I wrote the script below and put it in
>>>> crontab to automatically kill zombie sshd processes.
>>>>
>>>> the_walking_dead.sh:
>>>>
>>>> #!/bin/sh
>>>> kill -9 `ps afx|grep sshd|grep unknown|awk '{print $1}'`
>>>>
>>>>
>>>> Put this in /etc/crontab:
>>>>
>>>> 00 1 * * *    root    the_walking_dead.sh
>>>>
>>>>
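A slightly more defensive variant of that script might look like the sketch
below. It is only an illustration: the function names stuck_sshd_pids and
kill_stuck_sshd are made up here, and the field positions assume the
"ps afx" layout shown later in this thread (PID in column 1, STAT in
column 3). The main difference is that kill is only called when at least one
PID was found, since kill with no PID arguments just prints a usage error.

```sh
#!/bin/sh
# Sketch only: read "ps afx"-style lines on stdin and print the PIDs of
# non-zombie processes whose title contains "sshd: unknown".
stuck_sshd_pids() {
    # $1 = PID, $3 = STAT (assumed column layout of "ps afx")
    awk '$3 !~ /^Z/ && /sshd: unknown/ { print $1 }'
}

# Hypothetical wrapper: call kill only when there is something to kill.
kill_stuck_sshd() {
    pids=$(ps afx | stuck_sshd_pids)
    # $pids is deliberately unquoted so each PID becomes its own argument.
    [ -n "$pids" ] && kill -9 $pids
    return 0
}
```

Running "ps afx | stuck_sshd_pids" on its own first shows which PIDs would be
killed, without killing anything.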
>>> If 'kill -9' works, the process is not really a zombie. It simply still has
>>> a socket open and is waiting for it to be closed before exiting.
>>>
>>> You might take a look at network sockets with sockstat(1) and see if you
>>> can get any indication of why these sockets are not being closed. It may be
>>> that the issue is not sshd but some other issue in the OS leaving sockets
>>> open.
>>>
>> Hi Kevin,
>>
>> My ps -afx output is below:
>>
>> [...]
>> 42139  -  Is       0:00.01 sshd: unknown [priv] (sshd)
>> 42140  -  Z        0:00.01 <defunct>
>> 42141  -  IW       0:00.00 sshd: unknown [pam] (sshd)
>> 58445  -  Is       0:00.01 sshd: unknown [priv] (sshd)
>> 58446  -  Z        0:00.02 <defunct>
>> 58447  -  IW       0:00.00 sshd: unknown [pam] (sshd)
>> 65635  -  Is       0:00.01 sshd: vinicius [priv] (sshd)
>> 65636  -  Z        0:00.01 <defunct>
>> [...]
>>
>> # sockstat | grep 42140
>> #
>>
>> # sockstat | grep 58446
>> #
>>
>> # sockstat | grep 65636
>> #
>>
>> There is no socket associated with the zombie processes.
> Do a pstree.  I bet the zombies are children of the other processes that
> are stuck on a socket as Kevin described.
>
# ps afx|grep sshd |grep unk
10948  -  Is       0:00.02 sshd: unknown [priv] (sshd)
10955  -  IW       0:00.00 sshd: unknown [pam] (sshd)       <====
11701  -  Is       0:00.02 sshd: unknown [priv] (sshd)
11704  -  IW       0:00.00 sshd: unknown [pam] (sshd)
25450  -  Is       0:00.01 sshd: unknown [priv] (sshd)
25452  -  IW       0:00.00 sshd: unknown [pam] (sshd)
41193  -  Is       0:00.02 sshd: unknown [priv] (sshd)
41196  -  IW       0:00.00 sshd: unknown [pam] (sshd)
42193  -  Is       0:00.02 sshd: unknown [priv] (sshd)
42195  -  IW       0:00.00 sshd: unknown [pam] (sshd)
80638  -  Is       0:00.02 sshd: unknown [priv] (sshd)
80640  -  IW       0:00.00 sshd: unknown [pam] (sshd)
81484  -  Is       0:00.02 sshd: unknown [priv] (sshd)
81486  -  IW       0:00.00 sshd: unknown [pam] (sshd)
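
To check the suggestion that the zombies are children of these stuck
processes, something like the sketch below could pair each zombie with its
parent. It is an assumption-laden illustration, not a tested tool: the
function name zombie_parents is made up, and it expects the column order
produced by "ps -axo pid,ppid,stat,command".

```sh
#!/bin/sh
# Sketch only: read "ps -axo pid,ppid,stat,command" output on stdin and
# print one line per zombie, together with its parent PID and command.
zombie_parents() {
    awk '
        NR == 1 && $1 == "PID" { next }        # skip the header line
        {
            # rebuild the command text from field 4 onward
            cmd = $4
            for (i = 5; i <= NF; i++) cmd = cmd " " $i
            cmdof[$1] = cmd
            # remember zombies (STAT starts with Z) and their parent PID
            if ($3 ~ /^Z/) { zpid[++n] = $1; zppid[$1] = $2 }
        }
        END {
            for (i = 1; i <= n; i++) {
                p = zppid[zpid[i]]
                printf "zombie %s <- parent %s (%s)\n", zpid[i], p, (p in cmdof) ? cmdof[p] : "?"
            }
        }'
}
```

Usage would be: ps -axo pid,ppid,stat,command | zombie_parents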

With procstat I could see the socket as follows:

# procstat -f 10955
   PID COMM               FD T V FLAGS     REF  OFFSET PRO NAME
10955 sshd              text v r r-------  -       - - /usr/sbin/sshd
10955 sshd               cwd v d r-------  -       - - /
10955 sshd              root v d r-------  -       - - /
10955 sshd                 0 v c rw------  6       0 - /dev/null
10955 sshd                 1 v c rw------  6       0 - /dev/null
10955 sshd                 2 v c rw------  6       0 - /dev/null
10955 sshd                 3 s - rw---n--  2       0 TCP 186.xxx.xx.2:22 186.xxx.xx.8:57035
10955 sshd                 5 p - rw------  2       0 - -
10955 sshd                 6 s - rw------  2       0 UDS -
10955 sshd                 7 p - rw------  1       0 - -
10955 sshd                 8 s - rw------  2       0 UDS -

I do not understand why these connections remain stuck on
FreeBSD 10.0.

I'll try this sysctl: net.inet.tcp.delayed_ack=0




More information about the freebsd-stable mailing list