FreeBSD NFS server not responding to TCP SYN packets from
Linux/SunOS clients
Chuck Lever
cel at citi.umich.edu
Fri Oct 14 14:45:03 PDT 2005
rick at snowhite.cis.uoguelph.ca wrote:
>>where is that rule stated? most NFS clients i am aware of retransmit an
>>RPC after 60 seconds over TCP.
>
>
> For NFSv4, it's in RFC3530, Sec. 3.1.1 (actually applies to RPCs other
> than NULL).
i recently had a thorough discussion of this with the author of that
section, Mike Eisler.
> For NFSv2,3 it was never required by the RFCs, so it is
> questionable what the correct behaviour is. Being the first to do NFS over
> TCP, I only did retransmits after reconnect. I think I described it that
> way in the ancient Usenix paper. (http://snowhite.cis.uoguelph.ca/nfsv4,
> then click on it)
i will try to grab that.
> When Sun first did NFS over TCP, I believe they did
> do retries (using a conservative timeout). I think I eventually convinced Sun
> that it wasn't a good idea and I think that Solaris no longer
> does them, but I'm not sure. (For this to work correctly, a server is required
> to disconnect whenever it can't generate a reply to an RPC over TCP for any
> reason.)
yes, this is a difficult semantic.
it means that there is now a race that allows a server to redo a
non-idempotent request if the client reconnects on another port and
sends a retransmit of a stuck request. i've seen this in practice, and
for certain applications this will cause data corruption.
most Linux NFS clients will not reconnect on the same port after the
server disconnects (a bug i recently addressed). for servers with a
duplicate reply cache, this means the client can retransmit
non-idempotent requests and the DRC will not stop the requests from
being reapplied. such servers are dependent on identifying RPC requests
by the tuple of [ XID, source port, client IP ] -- if source port
changes, then the DRC is rendered ineffective.
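To make the source-port dependency concrete, here is a minimal sketch of a DRC keyed on [ XID, source port, client IP ]. The class, method names, addresses, and port numbers are all hypothetical and are not taken from any real server. A counter stands in for the state changed by a non-idempotent operation.

```python
from typing import Dict, Tuple

# Hypothetical cache key: (XID, source port, client IP)
DrcKey = Tuple[int, int, str]


class ToyServer:
    """Illustrative model only; not any real NFS server's DRC."""

    def __init__(self) -> None:
        self.drc: Dict[DrcKey, str] = {}
        self.counter = 0  # stands in for state changed by a non-idempotent op

    def handle(self, xid: int, src_port: int, client_ip: str) -> str:
        key = (xid, src_port, client_ip)
        if key in self.drc:
            # Duplicate detected: replay the cached reply, do not re-execute.
            return self.drc[key]
        # No matching entry: the request is executed (again).
        self.counter += 1
        reply = f"applied #{self.counter}"
        self.drc[key] = reply
        return reply


server = ToyServer()
server.handle(xid=7, src_port=801, client_ip="10.0.0.5")
server.handle(xid=7, src_port=801, client_ip="10.0.0.5")  # same port: DRC hit
same_port_count = server.counter
server.handle(xid=7, src_port=923, client_ip="10.0.0.5")  # new port: DRC miss
new_port_count = server.counter
```

In this model the retransmit from the original port is absorbed by the cache, while the same XID arriving from a new port is treated as a fresh request and applied a second time.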
servers that don't have a DRC for TCP are exposed to this problem. when
they disconnect the TCP connection, they've lost all stream transport
guarantees (no request reordering, no duplicate requests). on reconnect
a client can retransmit any requests it hasn't received a reply for,
which are then reapplied by the server. if the server doesn't guarantee
that these retransmitted requests are applied in the same order that the
original requests were applied, there is opportunity for data corruption.
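A toy illustration of the ordering hazard, assuming two overlapping writes to the same offset that were both applied before the disconnect and are then both retransmitted. The helper name and payloads are made up for the example.

```python
from typing import Iterable, Tuple

Write = Tuple[int, bytes]  # (offset, payload)


def apply_writes(data: bytearray, writes: Iterable[Write]) -> bytearray:
    """Apply each write in the order given, overwriting bytes in place."""
    for offset, payload in writes:
        data[offset:offset + len(payload)] = payload
    return data


w1: Write = (0, b"AAAA")  # issued first
w2: Write = (0, b"BBBB")  # issued second, overwrites w1

# Original order on the first connection: w1, then w2.
original = apply_writes(bytearray(4), [w1, w2])

# After reconnect the client retransmits both (no replies were received),
# and the server happens to apply them in the opposite order.
replayed = apply_writes(bytearray(original), [w2, w1])
```

Here `original` ends up holding the data from w2, while `replayed` ends up holding the older data from w1, even though the client never issued anything new.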
retransmitting an idempotent request will cause a connection drop,
meaning any non-idempotent requests that were outstanding at the time
will have to be retransmitted.
this is load-dependent behavior. when a server slows down, a client
that retransmits on TCP is more likely to retransmit one or more
non-idempotent requests. this means the server will disconnect,
creating even more work for server, network, and client, and it means
the likelihood of data corruption increases as load increases.
if a client *doesn't* retransmit, is there any guarantee that a
hard-mounted client can make forward progress?
> So, for NFSv2,3 I don't know of a stated "rule". I don't think it is covered
> in the NFS interoperability RFC that appeared a while back, but can't
> remember for sure.
we've been looking for a while, but haven't seen anything.