FreeBSD NFS server not responding to TCP SYN packets from
Linux/SunOS clients
rick at snowhite.cis.uoguelph.ca
Fri Oct 14 09:54:27 PDT 2005
>> rick
>> ps: It would be nice if someone with the right expertise could explore
>> other things in TCP specifically for NFS. For example, I don't see
>> why a retransmit timeout should go above about 100msec, since net
>> delays are well below that level, even half way around the world
>> these days. Having said that, I don't know enough about TCP retransmit
>> to say that one second retry intervals aren't correct?
>
>Wouldn't this be a problem for a server under high disk load? If the
>disks are very very busy, and clients are requesting stat's on files,
>etc, then the server would be waiting on disk, and the time could be way
>more than 100ms, even more than 1s. Of course, this would be a slow
>server because of the load, however it does occur, and so lowering it to
>100msec might be too aggressive. If you have many many clients, all
>attempting lots of NFS activity, during times of load you could make the
>server even more overloaded with all the retransmits, right?
It is a concern. If the previously sent request is still in the server's
TCP socket receive queue, then TCP will throw away the retransmit. If the
request is in progress via an nfsd thread, then the recent request cache code
should wait for the reply created from the first one and then both
requests get copies of the reply. (This introduces overhead, but at
least no additional disk I/O or risk of repeating a non-idempotent request.)
nb: My current server cache code does this, but I don't believe the one
currently in FreeBSD does?
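
To make the in-progress case concrete, here is a minimal sketch in Python (not kernel code, and not the actual FreeBSD or my server's cache). The names RecentRequestCache, begin, finish and handle are made up for illustration, and keying on (client, xid) is an assumption. The first arrival runs the request; a retransmit that arrives while it is still running waits and then receives the same reply, so the operation executes only once.

```python
import threading


class RecentRequestCache:
    """Toy recent request cache keyed on (client, xid). Hypothetical names."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # key -> {"done": Event, "reply": object}

    def begin(self, key):
        """Return (is_first, entry); only the first caller should execute."""
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                return False, entry
            entry = {"done": threading.Event(), "reply": None}
            self._entries[key] = entry
            return True, entry

    def finish(self, entry, reply):
        """Store the reply and wake any duplicates waiting on it."""
        entry["reply"] = reply
        entry["done"].set()


def handle(cache, key, execute):
    """Run execute() once per key; duplicates wait for the cached reply."""
    is_first, entry = cache.begin(key)
    if is_first:
        cache.finish(entry, execute())
    else:
        # Retransmit: no second execution, just wait for the first reply.
        entry["done"].wait()
    return entry["reply"]
```

The sketch leaves out entry eviction and error handling (if execute() raises, waiters would block forever), both of which a real cache would need.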
The trick is to not have the nfsd threads remove a request from the socket
receive queue until the disk subsystem isn't backlogged. Since delayed ACK
is disabled for NFS over TCP, the server will then throw away the retransmitted
request and generate an ACK to the client right away, so the TCP layer in
the client won't retransmit it again.
The problem is "how do you make sure the nfsd threads don't start a
request if the disk I/O subsystem is backlogged". An interesting question
and I'd appreciate hearing suggestions. Part of the problem is that many
requests can be satisfied out of caches in the server (such as the vnode/inode
in memory, for a Getattr) and the server doesn't know if a request will
be doing disk I/O (it's hidden behind the VFS/Vnode layer).
One possibility is for nfsd threads to time how long they take to do
a request. When the thread sees that time increasing dramatically, it
could assume a backlog in the disk I/O subsystem and sleep for a while
before getting the next request off a socket receive queue. Sounds like
something worth looking at. Unfortunately I think it will require a
pretty high resolution time clock and I don't think I can count on that
in FreeBSD unless the server has the right hardware?
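
A rough user-space sketch of that timing idea, in Python rather than kernel C. Everything here is an assumption for illustration: the names BacklogDetector and nfsd_loop, the smoothed-average baseline, and the alpha, factor and pause values are invented, not tuned. The thread times each request, compares it against a running average, and sleeps before taking the next request when the time jumps well above that average.

```python
import time


class BacklogDetector:
    """Flags a likely disk backlog from per-request service times."""

    def __init__(self, alpha=0.125, factor=4.0, pause=0.05):
        self.alpha = alpha      # weight of the newest sample in the average
        self.factor = factor    # "dramatic increase" threshold multiplier
        self.pause = pause      # seconds to back off when backlog is suspected
        self.baseline = None    # smoothed service time, in seconds

    def record(self, elapsed):
        """Feed one service time; return how long to sleep before the next one."""
        if self.baseline is None:
            self.baseline = elapsed
            return 0.0
        backlogged = elapsed > self.factor * self.baseline
        # Always update, so the baseline adapts if the load stays high.
        self.baseline += self.alpha * (elapsed - self.baseline)
        return self.pause if backlogged else 0.0


def nfsd_loop(requests, service, detector, sleep=time.sleep, clock=time.monotonic):
    """Service requests in order, pausing when the detector suspects a backlog."""
    for req in requests:
        start = clock()
        service(req)
        delay = detector.record(clock() - start)
        if delay > 0:
            sleep(delay)
```

The clock and sleep functions are parameters so the loop can be exercised with a fake clock. The actual resolution of time.monotonic() depends on the platform, which is the same concern as above.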
Any other ideas? rick