TFO for NFS

Sat Aug 29 02:10:52 UTC 2020

Scheffenegger, Richard wrote:
>I know, NFS TCP sessions are some of the most long-lived sessions in regular use.
Ok, so I'll admit I can't wrap my head around this.
It is way out of my area of expertise (so I've added freebsd-net@ to the cc), but
it seems to me that NFS is about the least appropriate use for TFO.

It seems that, for TFO to be useful, the application needs to be doing frequent
short-lived TCP connections, often across a WAN/the Internet.
NFS mounts do neither of the above.
- They, as we've noted, normally only do a TCP connect at mount time.
- They usually run in low latency LAN environments. (High latency connections
   hammer NFS performance, due to its frequent small RPCs that the client
   must wait for replies to synchronously.)

All you might save is one RTT. Take a look at how many RPCs (each with an RTT)
happen on an active NFS mount.
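
As a back-of-the-envelope sketch of that point (the RPC counts are illustrative
assumptions, not measurements, and the model simply charges one RTT for the TCP
handshake plus one RTT per synchronous RPC):

```python
def tfo_saving_fraction(rpc_count: int, handshake_rtts: int = 1) -> float:
    """Fraction of all round trips removed if TFO hides the handshake RTT(s).

    Simplified model (an assumption): total round trips are the handshake
    plus one per synchronous RPC; TFO removes only the handshake part.
    """
    if rpc_count < 0 or handshake_rtts < 0:
        raise ValueError("counts must be non-negative")
    total = handshake_rtts + rpc_count
    return handshake_rtts / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical RPC counts over the lifetime of one mount.
    for rpcs in (1, 100, 100_000):
        share = tfo_saving_fraction(rpcs)
        print(f"{rpcs:>7} RPCs: TFO saves {share:.4%} of round trips")
```

The saving shrinks roughly as 1/N with the number of RPCs issued on the
connection, which is why a long-lived mount sees essentially no benefit.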

>My rationale is two-fold:
>
>First, having a relatively high-profile use of the TFO option in the core OS modules
>will definitely expose that feature to at least some use.
Well, I don't think it is NFS's job to expose a feature that is not useful for it.
(If you were to implement this and benchmarking showed a significant
 improvement in elapsed time to do an NFS mount, then that could be a
different story.)

>Second, in case of a network disconnect (or something that my company does,
>that would be most comparable to unassigning and reassigning the server IP
>address between different physical ports), while there is IO load, TFO may reduce
>(ever so slightly) the latency impact of the enqueued IOs.
I'm not sure I understand this. NFS always uses port# 2049.
If you are referring to the host IP address, then wouldn't that be handled via
ARP and routing? (Does this require a fresh TCP connection to the same server
IP address?)

>My plan is first to simply enable the socket option - that should result in TFO
>getting negotiated, but no actual latency improvement, while the traditional
>connect() sequence to set up a TCP session is still done from the client side; the
>server side will not need to change, and can send out initial data right away with
>the syn/ack (at least in theory, if the syn contained a full NFS request that can be
>responded to).
>
>Changing the client to make use of the SYN+data facilities would be a 2nd step.
Well, during an NFS mount, there is first a TCP connection made by
mount_nfs in userspace and it is only used for a single Null RPC.
--> This checks that the server is up and running.
Then mount_nfs does nmount(2), which will create a second TCP connection
which is normally used until unmount.
--> All you save is the RTT for the one first RPC of many.

>Also, I shall make this configurable, since some network devices may inhibit TFO
>packets, incurring a delay (but that's mostly public internet, not private networks
>where NFS is being used). Ideally with TFO default to on (once it's working
>properly), but able to explicitly disable it for certain mounts.
NFS suffered TSO related bugs for several (> 5) years (and I wouldn't be surprised
if there are still net device drivers broken such that TSO must be disabled to make
NFS work ok on them).

As such, I get very nervous about this kind of thing.

Reliability always trumps performance when it comes to file system work.

Now, if you are interested in improving NFS performance over TCP, that
could be a very interesting project, but I doubt TFO would be relevant.
Especially when you look at long fat pipes (TCP connections with a large
delay * bandwidth), there is probably a lot that could be done.
--> Read-ahead, write-back algorithm changes. Read/Write data size.
       Throttling/congestion avoidance/window sizing in TCP.
       And the list goes on and on...
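
To make the "long fat pipe" point concrete, a rough sketch (the link speed,
RTT and I/O size are assumed example values, not measurements, and the helper
names are made up for illustration):

```python
import math


def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: data in flight needed to fill the pipe."""
    return bandwidth_bps * rtt_s / 8


def rpcs_in_flight_to_fill(bandwidth_bps: float, rtt_s: float, io_size: int) -> int:
    """Concurrent read/write RPCs of io_size bytes needed to cover the BDP."""
    return math.ceil(bdp_bytes(bandwidth_bps, rtt_s) / io_size)


def single_rpc_throughput(io_size: int, rtt_s: float) -> float:
    """Bytes/sec when only one io_size RPC is outstanding per RTT."""
    return io_size / rtt_s


if __name__ == "__main__":
    bw, rtt, io = 1e9, 0.050, 64 * 1024  # assumed: 1 Gbit/s, 50 ms, 64 KiB I/O
    print(f"BDP: {bdp_bytes(bw, rtt) / 1e6:.2f} MB")
    print(f"RPCs in flight to fill pipe: {rpcs_in_flight_to_fill(bw, rtt, io)}")
    print(f"One RPC at a time: {single_rpc_throughput(io, rtt) / 1e6:.2f} MB/s")
```

With numbers like these, read-ahead depth, write-back behaviour and I/O size
dominate throughput, while a single saved RTT at connect time does not show up.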

I do hope that NFS over TLS allows more use of NFS across the Internet,
so performance work related to NFS running on WAN/Internet connections
would be a great thing to do. (I'm not conversant with the current TCP stack,
so I'm not the guy to tackle this.)

rick

Richard Scheffenegger

-----Original Message-----
From: Rick Macklem <rmacklem at uoguelph.ca>
Sent: Freitag, 28. August 2020 04:35
To: Scheffenegger, Richard <Richard.Scheffenegger at netapp.com>; rmacklem at freebsd.org
Cc: Michael Tuexen <tuexen at freebsd.org>
Subject: Re: TFO for NFS

Well, you'll find the soconnect() stuff in sys/rpc/clnt_vc.c.
If you just want to play around with it, have fun.

As for this being useful in practice, that seems unlikely.
When the kernel RPC code uses TCP it establishes one TCP connection at mount time and uses that connection until unmount unless the connection breaks somehow.
(A server will often disconnect after about 5 minutes of no activity on the connection. This almost never happens for NFSv4, since the NFSv4 client does an RPC every 30sec to maintain the lease against the server.)
--> A new TCP connection usually only happens after a
      network partitioning heals.
(There was a bug that caused reconnects during certain cases of signal handling, but that was fixed about 3 years ago.)

rick

________________________________________
From: Scheffenegger, Richard <Richard.Scheffenegger at netapp.com>
Sent: Thursday, August 27, 2020 6:29 PM
To: rmacklem at freebsd.org
Cc: Michael Tuexen
Subject: TFO for NFS

Hi Rick,

I've seen you are very active with the fbsd nfs code, having branched the nfs-over-tls project.

Is anyone else contributing to this project yet?

After some discussion in today's freebsd-transport call with tuexen@, I was wondering if the TCP Fast Open option could be added as a proof-of-concept to the in-kernel RPC handler. It may also be a nice augmentation of nfs-over-tls when available, to absorb some of the added tls connection setup latency...

Right now, I am quite unfamiliar with all the rpc code, which appears to handle all the basic plumbing of NFS.

Would you be interested in helping me with advice and reviews, in order to try and get something around TFO working?

(The reduction in time-to-first-IO by 1 RTT may be helpful in some scenarios, or when TLS 1.2 instead of 1.3 is in use, where speeding up the tls handshake would potentially also be a nice property.)

Having said all this, for a client to actually make use of TFO, it is likely that slight changes / additions need to be made, in order to send out the initial data (TLS or RPC) right away before any soconnect(), using sendmsg() instead - causing the socket itself to figure out that tcp can connect and send data at the same time...

Best regards,

Richard Scheffenegger