net/nylon hangs in read(2) under FreeBSD 10

Tue Mar 18 10:10:58 UTC 2014

I've found an old thread describing the same problem:
http://marc.info/?l=freebsd-stable&m=101835932330932

It looks like this undesirable behaviour is considered normal in nylon (?).
The code is in src/atomicio.c:
....
/*
 * ensure all of data on socket comes through. f==read || f==write
 */
ssize_t
atomicio(f, fd, _s, n)
        ssize_t (*f) ();
        int fd;
        void *_s;
        size_t n;
{
        char *s = _s;
        ssize_t res, pos = 0;

        while (n > pos) {
                res = (f) (fd, s + pos, n - pos);
                switch (res) {
                case -1:
                        if (errno == EINTR || errno == EAGAIN)
                                continue;
                case 0:
                        if (pos != 0)
                                return (pos);
                        return (res);
                default:
                        pos += res;
                }
        }
        return (pos);
}
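
The loop above retries immediately on EAGAIN. Presumably the client socket is
non-blocking, since a blocking read(2) would sleep rather than return EAGAIN.
The following minimal sketch (not nylon's code; both helper names are
hypothetical) shows that read(2) on a non-blocking socket with no pending data
fails at once with EAGAIN every time, which is what turns the `continue` into
a busy loop:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connected socket pair with sv[0] set non-blocking. */
int
make_nonblocking_pair(int sv[2])
{
        int flags;

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
                return (-1);
        flags = fcntl(sv[0], F_GETFL, 0);
        if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1) {
                close(sv[0]);
                close(sv[1]);
                return (-1);
        }
        return (0);
}

/* Hypothetical helper: attempt `tries` 1-byte reads, count EAGAIN failures. */
int
count_eagain_reads(int fd, int tries)
{
        char c;
        int i, n = 0;

        for (i = 0; i < tries; i++) {
                if (read(fd, &c, 1) == -1 &&
                    (errno == EAGAIN || errno == EWOULDBLOCK))
                        n++;
        }
        return (n);
}
```

With an idle peer, every attempt counted by count_eagain_reads() fails with
EAGAIN; atomicio() has no bound on the number of attempts, so it never leaves
the loop.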

What do you think, is it a bug or a feature? Should I report it to the OpenBSD folks?

2014-03-18 11:49 GMT+04:00 Pavel Timofeev <timp87 at gmail.com>:
> It turns out that it's enough to run
> # telnet proxy.xxx.ru 1080
> and do nothing after that to make nylon hang in read(2).
> Is nylon's code naive?
>
> 2014-03-12 11:34 GMT+04:00 Pavel Timofeev <timp87 at gmail.com>:
>> Hello!
>> I have been using net/nylon (a SOCKS proxy server that originated in
>> OpenBSD) for some time under FreeBSD 10.0-RELEASE amd64.
>> I had no problems with that configuration in the test lab,
>> but now I have problems with it in production.
>> Sometimes nylon starts to consume an entire CPU.
>>
>> For example, I have proxy.xxx.ru (192.168.31.198) and client1.xxx.ru
>> (192.168.2.6).
>> Here is what I see every time.
>> Let's say the hung nylon process has PID 5323:
>>
>> # truss -p 5323
>> ......
>> read(6,0x7fffffffdb13,1)    ERR#35 'Resource temporarily unavailable'
>> read(6,0x7fffffffdb13,1)    ERR#35 'Resource temporarily unavailable'
>> read(6,0x7fffffffdb13,1)    ERR#35 'Resource temporarily unavailable'
>> read(6,0x7fffffffdb13,1)    ERR#35 'Resource temporarily unavailable'
>> read(6,0x7fffffffdb13,1)    ERR#35 'Resource temporarily unavailable'
>> ^C
>>
>> Nylon tries to read from FD 6 (right?) and gets errno 35 (EAGAIN) every
>> time, in an infinite loop.
>>
>>
>> # lsof -p 5323
>> COMMAND  PID USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
>> nylon   5323 root  cwd   VDIR               0,96     1024       2 /
>> nylon   5323 root  rtd   VDIR               0,96     1024       2 /
>> nylon   5323 root  txt   VREG               0,96    34840 1134069 /usr/local/bin/nylon
>> nylon   5323 root  txt   VREG               0,96   111696  240770 /libexec/ld-elf.so.1
>> nylon   5323 root  txt   VREG               0,96   306532 2970271 /usr/local/lib/event2/libevent-2.0.so.6
>> nylon   5323 root  txt   VREG               0,96  1567216  481549 /lib/libc.so.7
>> nylon   5323 root  txt   VREG               0,96   105104  481569 /lib/libthr.so.3
>> nylon   5323 root    0u  VCHR               0,15      0t0      15 /dev/null
>> nylon   5323 root    1u  VCHR               0,15      0t0      15 /dev/null
>> nylon   5323 root    2u  VCHR               0,15      0t0      15 /dev/null
>> nylon   5323 root    4u  IPv4 0xfffff800646b9c00      0t0     TCP proxy.xxx.ru:socks (LISTEN)
>> nylon   5323 root    5u  unix 0xfffff8006494d000      0t0         ->0xfffff800098862b8
>> nylon   5323 root    6u  IPv4 0xfffff8011cd07400      0t0     TCP proxy.xxx.ru:socks->client1.xxx.ru:45737 (ESTABLISHED)
>>
>> That FD appears to be the last line of this output. It's a TCP socket (right?).
>>
>>
>> # sockstat | grep 5323
>> root     nylon      5323  4  tcp4   192.168.31.198:1080   *:*
>> root     nylon      5323  5  dgram  -> /var/run/logpriv
>> root     nylon      5323  6  tcp4   192.168.31.198:1080   192.168.2.6:45737
>>
>> That PID has an open socket to client1.xxx.ru (192.168.2.6).
>>
>> I looked at the open sockets on client1.xxx.ru and did not find a matching one.
>>
>> And I can kill the hung process only with "kill -9".
>>
>>
>> There appears to be a problem here. But where? In nylon, or even in FreeBSD?
>> I'm not a UNIX or programming professional, and I don't know whether the
>> OS is supposed to return errno 35 for a read on that dead(?) socket.
>> Should I provide more info? If so, what? I look forward to your replies!
>> It happens quite often now.