FreeBSD NFS server not responding to TCP SYN packets from Linux/SunOS clients

Fri Oct 14 09:06:01 PDT 2005

As others have noted, the problem is that the connection is still
established on the server and the client re-uses the same port#.

Due to the NDA, I can't tell you the details, but there was an amusing
case at the last NFSv4 Bakeathon which I'll call the "Psychic Server Theory".

Basically, my server had a bug (that was fixed) where it would generate
a Readdir reply larger than requested for a certain case. A client would
crash when it saw this reply. The interesting part was that it would
crash again as soon as it was rebooted, before it even attempted a remount.

The theory was that my "Evil Psychic Server" KNEW that the client might
try to mount it and would crash it "somehow" before the mount took place.
(What was really happening was that the server had the TCP connection
 still established and would resend the reply to the same port#, which
 this client already had configured for its first mount.)
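
To make the mechanism concrete, here is a toy model of it. It is not real TCP and not the code of either implementation involved; the class names, the address and the port number are all made up for illustration. The only point is that the server identifies the connection by the client's address and port, so a rebooted client that configures the same port receives the stale reply.

```python
# Toy model only: "connections" are just (client address, client port) keys.
class ToyServer:
    def __init__(self):
        # (client_ip, client_port) -> reply the server still considers unacknowledged
        self.unacked = {}

    def send_reply(self, client_ip, client_port, reply):
        self.unacked[(client_ip, client_port)] = reply

    def retransmit(self):
        # The server cannot tell that the client rebooted, so it resends
        # every unacknowledged reply to the same address and port.
        return [(ip, port, reply) for (ip, port), reply in self.unacked.items()]


class ToyClient:
    def __init__(self, ip, port):
        self.ip, self.port = ip, port
        self.received = []

    def deliver(self, packets):
        # Accept anything addressed to this client's address and port.
        for ip, port, reply in packets:
            if (ip, port) == (self.ip, self.port):
                self.received.append(reply)


server = ToyServer()
server.send_reply("10.0.0.2", 1023, "oversized Readdir reply")

# The client crashes, reboots, and configures the same port before any remount.
rebooted = ToyClient("10.0.0.2", 1023)
rebooted.deliver(server.retransmit())
print(rebooted.received)
```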

From an NFS point of view, I can't see a security risk w.r.t. the server
breaking the TCP connection more quickly. (I don't know what implications
this change might have w.r.t. TCP level denial of service attacks, etc.)

What breaking connections sooner will do is increase the risk of data corruption on the server, since
a TCP reconnect on the client implies it must retry all outstanding requests,
including non-idempotent ones. In the generic FreeBSD NFS server, the recent
request cache is normally disabled for TCP, so a retry of a non-idempotent
RPC can/will corrupt the server file system (from the point of view of
what the client expects to have on the file system).
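
A small sketch of why the recent request cache matters for retries. This is a toy in-memory model, not the FreeBSD code: `ToyNfsServer`, the `(client_id, xid)` cache key and the string replies are assumptions chosen for illustration, and a real cache has to deal with eviction, in-progress requests and more. Remove is used as the non-idempotent RPC.

```python
class ToyNfsServer:
    def __init__(self, use_cache):
        self.files = {"a"}
        self.use_cache = use_cache
        self.cache = {}  # hypothetical key: (client_id, xid) -> reply already sent

    def remove(self, client_id, xid, name):
        key = (client_id, xid)
        if self.use_cache and key in self.cache:
            # Retry of a request already executed: replay the old reply
            # instead of performing the operation a second time.
            return self.cache[key]
        if name in self.files:
            self.files.remove(name)
            reply = "OK"
        else:
            reply = "ENOENT"
        if self.use_cache:
            self.cache[key] = reply
        return reply


def remove_with_retry(server):
    # First attempt succeeds on the server, but assume the reply is lost
    # because the TCP connection was broken.
    server.remove("client1", 42, "a")
    # After reconnecting, the client retries the same request (same xid).
    return server.remove("client1", 42, "a")


print(remove_with_retry(ToyNfsServer(use_cache=False)))  # ENOENT
print(remove_with_retry(ToyNfsServer(use_cache=True)))   # OK
```

Without the cache the client is told the file did not exist, even though its own Remove is what deleted it. With the cache, the retry gets the reply the client expected.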

My new server does have a redesigned recent request cache that would
minimize this risk, although there will always be a worst case scenario
where problems could still occur.

In summary, unless there is an increased risk of a "denial of service"
attack at the TCP level, it would be nice if the nfsd threads could tell
TCP that connections can be broken down fairly quickly for unresponsive
clients. This assumes a recent request cache that works well for TCP.
(Related to this, having "keep alives" done more frequently should detect
 the rebooted client, since the connection doesn't exist at the other end?)
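
For reference, this is roughly what more frequent keep alives look like from userland on an ordinary TCP socket. It is only a sketch: the nfsd threads live in the kernel and would not use this interface, the helper name `enable_fast_keepalive` and the timer values are made up, and the per-socket `TCP_KEEP*` options are not available on every platform (hence the `getattr` guard).

```python
import socket


def enable_fast_keepalive(sock, idle=30, interval=5, count=3):
    """Turn on keep alives and, where supported, shorten the timers.

    idle:     seconds of silence before the first probe
    interval: seconds between probes
    count:    unanswered probes before the connection is dropped
    All three values are illustrative, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)  # missing on some platforms
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_fast_keepalive(s)
keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
s.close()
```

A rebooted client has no state for the old connection, so it would normally answer a probe with a reset, which lets the server drop the connection without waiting for the probe count to run out.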

rick
ps: It would be nice if someone with the right expertise could explore
    other things in TCP specifically for NFS. For example, I don't see
    why a retransmit timeout should go above about 100 msec, since net
    delays are well below that level, even half way around the world
    these days. Having said that, I don't know enough about TCP retransmit
    to say whether one-second retry intervals are actually incorrect.