Terrible NFS performance under 9.2-RELEASE?
Rick Macklem
rmacklem at uoguelph.ca
Tue Jan 21 02:01:14 UTC 2014
Since this is getting long-winded, I'm going to "cheat" and top post.
(Don't-top-post flame suit on ;-)
You could try setting
net.inet.tcp.delayed_ack=0
via sysctl.
I just looked and it appears that TCP delays ACKs for a while, even
when TCP_NODELAY is set (I didn't know that). I honestly don't know
how much/if any effect these delayed ACKs will have, but if you
disable them, you can see what happens.
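Something like this (it's a run-time sysctl; put it in /etc/sysctl.conf
if you want it to stick across reboots):

# check the current value
sysctl net.inet.tcp.delayed_ack
# turn delayed ACKs off for the test (as root)
sysctl net.inet.tcp.delayed_ack=0
# the "ack-only packets (NNN delayed)" counter in the tcp stats should
# stop climbing while it is off
netstat -s -p tcp | grep delayed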
rick
> Date: Sun, 19 Jan 2014 23:08:04 -0500
> From: J David <j.david.lists at gmail.com>
> To: Rick Macklem <rmacklem at uoguelph.ca>
> Subject: Re: Terrible NFS performance under 9.2-RELEASE?
>
> On Sun, Jan 19, 2014 at 9:32 AM, Alfred Perlstein
> <alfred at freebsd.org> wrote:
> > I hit nearly the same problem and raising the mbufs worked for me.
> >
> > I'd suggest raising that and retrying.
>
> That doesn't seem to be an issue here; mbufs are well below max on
> both client and server and all the "delayed"/"denied" lines are
> 0/0/0.
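> (For anyone else chasing this, the check/raise boils down to something
> like the following -- the number here is only an example, not a tuned
> value:)
>
> $ netstat -m | grep -E 'mbuf|denied|delayed'
> $ sysctl kern.ipc.nmbclusters
> # raise it at run time (as root), e.g.:
> #   sysctl kern.ipc.nmbclusters=262144
> # or persistently via /boot/loader.conf:
> #   kern.ipc.nmbclusters="262144"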
>
>
> On Sun, Jan 19, 2014 at 12:58 PM, Adam McDougall
> <mcdouga9 at egr.msu.edu> wrote:
> > Also try rsize=32768,wsize=32768 in your mount options, made a huge
> > difference for me.
>
> This does make a difference, but inconsistently.
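> (For reference, those sizes just go on the client mount, along the
> lines of the following -- server name and paths are placeholders:)
>
> $ mount -t nfs -o tcp,rsize=32768,wsize=32768 server:/export /mnt
> # or the /etc/fstab equivalent:
> #   server:/export  /mnt  nfs  rw,tcp,rsize=32768,wsize=32768  0  0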
>
> In order to test this further, I created a Debian guest on the same
> host as these two FreeBSD hosts and re-ran the tests with it acting
> as
> both client and server, and ran them for both 32k and 64k.
>
> Findings:
>
>
>                                                          random    random
>                        write  rewrite    read   reread     read     write
> S:FBSD,C:FBSD,Z:64k    67246     2923  103295  1272407   172475       196
> S:FBSD,C:FBSD,Z:32k    11951    99896  223787  1051948   223276     13686
> S:FBSD,C:DEB,Z:64k     11414    14445   31554    30156    30368     13799
> S:FBSD,C:DEB,Z:32k     11215    14442   31439    31026    29608     13769
> S:DEB,C:FBSD,Z:64k     36844   173312  313919  1169426   188432     14273
> S:DEB,C:FBSD,Z:32k     66928   120660  257830  1048309   225807     18103
>
> So the rsize/wsize makes a difference between two FreeBSD nodes, but
> with a Debian node as either client or server, it no longer seems to
> matter much. And /proc/mounts on the Debian box confirms that it
> negotiates and honors the 64k size as a client.
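> (For completeness, the quick way to see what actually got negotiated:
> the Linux client shows it in /proc/mounts, and on FreeBSD "nfsstat -m"
> reports the options the client is really using, if your nfsstat has
> that flag:)
>
> linux$ grep nfs /proc/mounts
> fbsd$ nfsstat -m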
>
> On Sun, Jan 19, 2014 at 6:36 PM, Rick Macklem <rmacklem at uoguelph.ca>
> wrote:
> > Yes, it shouldn't make a big difference but it sometimes does. When
> > it
> > does, I believe that indicates there is a problem with your network
> > fabric.
>
> Given that this is an entirely virtual environment, if your belief is
> correct, where would supporting evidence be found?
>
> As far as I can tell, there are no interface errors reported on the
> host (checking both taps and the bridge) or any of the guests,
> nothing
> in sysctl dev.vtnet of concern, etc. Also the improvement from using
> Debian on either side, even with 64k sizes, seems counterintuitive.
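> (Concretely, "no errors" means checks along these lines came back
> clean on the host and in both guests -- netstat's error/drop columns
> plus whatever counters the vtnet sysctl tree exposes:)
>
> $ netstat -i
> $ sysctl dev.vtnet.0 | grep -iE 'drop|err'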
>
> To try to help vindicate the network stack, I did iperf -d between
> the
> two FreeBSD nodes while the iozone was running:
>
> Server:
>
> $ iperf -s
>
> ------------------------------------------------------------
>
> Server listening on TCP port 5001
>
> TCP window size: 1.00 MByte (default)
>
> ------------------------------------------------------------
>
> [ 4] local 172.20.20.162 port 5001 connected with 172.20.20.169 port
> 37449
>
> ------------------------------------------------------------
>
> Client connecting to 172.20.20.169, TCP port 5001
>
> TCP window size: 1.00 MByte (default)
>
> ------------------------------------------------------------
>
> [ 6] local 172.20.20.162 port 28634 connected with 172.20.20.169
> port 5001
>
> Waiting for server threads to complete. Interrupt again to force
> quit.
>
> [ ID] Interval Transfer Bandwidth
>
> [ 6] 0.0-10.0 sec 15.8 GBytes 13.6 Gbits/sec
>
> [ 4] 0.0-10.0 sec 15.6 GBytes 13.4 Gbits/sec
>
>
> Client:
>
> $ iperf -c 172.20.20.162 -d
>
> ------------------------------------------------------------
>
> Server listening on TCP port 5001
>
> TCP window size: 1.00 MByte (default)
>
> ------------------------------------------------------------
>
> ------------------------------------------------------------
>
> Client connecting to 172.20.20.162, TCP port 5001
>
> TCP window size: 1.00 MByte (default)
>
> ------------------------------------------------------------
>
> [ 5] local 172.20.20.169 port 32533 connected with 172.20.20.162
> port 5001
>
> [ 4] local 172.20.20.169 port 5001 connected with 172.20.20.162 port
> 36617
>
> [ ID] Interval Transfer Bandwidth
>
> [ 5] 0.0-10.0 sec 15.6 GBytes 13.4 Gbits/sec
>
> [ 4] 0.0-10.0 sec 15.5 GBytes 13.3 Gbits/sec
>
>
> mbuf usage is pretty low.
>
> Server:
>
> $ netstat -m
>
> 545/4075/4620 mbufs in use (current/cache/total)
>
> 535/1819/2354/131072 mbuf clusters in use (current/cache/total/max)
>
> 535/1641 mbuf+clusters out of packet secondary zone in use
> (current/cache)
>
> 0/2034/2034/12800 4k (page size) jumbo clusters in use
> (current/cache/total/max)
>
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
>
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
>
> 1206K/12792K/13999K bytes allocated to network (current/cache/total)
>
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
>
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
>
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>
> 0/0/0 sfbufs in use (current/peak/max)
>
> 0 requests for sfbufs denied
>
> 0 requests for sfbufs delayed
>
> 0 requests for I/O initiated by sendfile
>
> 0 calls to protocol drain routines
>
>
> Client:
>
> $ netstat -m
>
> 1841/3544/5385 mbufs in use (current/cache/total)
>
> 1172/1198/2370/32768 mbuf clusters in use (current/cache/total/max)
>
> 512/896 mbuf+clusters out of packet secondary zone in use
> (current/cache)
>
> 0/2314/2314/16384 4k (page size) jumbo clusters in use
> (current/cache/total/max)
>
> 0/0/0/8192 9k jumbo clusters in use (current/cache/total/max)
>
> 0/0/0/4096 16k jumbo clusters in use (current/cache/total/max)
>
> 2804K/12538K/15342K bytes allocated to network (current/cache/total)
>
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
>
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
>
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
>
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
>
> 0/0/0 sfbufs in use (current/peak/max)
>
> 0 requests for sfbufs denied
>
> 0 requests for sfbufs delayed
>
> 0 requests for I/O initiated by sendfile
>
> 0 calls to protocol drain routines
>
>
>
> Here's 60 seconds of netstat -ss for ip and tcp from the server with
> the 64k mount running iozone:
>
> ip:
>
> 4776 total packets received
>
> 4758 packets for this host
>
> 18 packets for unknown/unsupported protocol
>
> 2238 packets sent from this host
>
> tcp:
>
> 2244 packets sent
>
> 1427 data packets (238332 bytes)
>
> 5 data packets (820 bytes) retransmitted
>
> 812 ack-only packets (587 delayed)
>
> 2235 packets received
>
> 1428 acks (for 238368 bytes)
>
> 2007 packets (91952792 bytes) received in-sequence
>
> 225 out-of-order packets (325800 bytes)
>
> 1428 segments updated rtt (of 1426 attempts)
>
> 5 retransmit timeouts
>
> 587 correct data packet header predictions
>
> 225 SACK options (SACK blocks) sent
>
>
> And with 32k mount:
>
> ip:
>
> 24172 total packets received
>
> 24167 packets for this host
>
> 5 packets for unknown/unsupported protocol
>
> 26130 packets sent from this host
>
> tcp:
>
> 26130 packets sent
>
> 23506 data packets (5362120 bytes)
>
> 2624 ack-only packets (454 delayed)
>
> 21671 packets received
>
> 18143 acks (for 5362192 bytes)
>
> 20278 packets (756617316 bytes) received in-sequence
>
> 96 out-of-order packets (145964 bytes)
>
> 18143 segments updated rtt (of 17469 attempts)
>
> 1093 correct ACK header predictions
>
> 3449 correct data packet header predictions
>
> 111 SACK options (SACK blocks) sent
>
>
> So the 32k mount sends about 6x the packet volume. (This is on
> iozone's linear write test.)
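> (In case anyone wants to reproduce the 60-second numbers: nothing
> fancier than zeroing the counters with netstat's -z, if yours has it,
> waiting, and dumping the non-zero ones again -- otherwise take two
> snapshots and subtract:)
>
> $ netstat -s -z > /dev/null        # as root; resets the counters
> $ sleep 60
> $ netstat -ss -p ip
> $ netstat -ss -p tcp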
>
> One thing I've noticed is that when the 64k connection bogs down, it
> seems to "poison" things for a while. For example, iperf will start
> doing this afterward:
>
> From the client to the server:
>
> $ iperf -c 172.20.20.162
>
> ------------------------------------------------------------
>
> Client connecting to 172.20.20.162, TCP port 5001
>
> TCP window size: 1.00 MByte (default)
>
> ------------------------------------------------------------
>
> [ 3] local 172.20.20.169 port 14337 connected with 172.20.20.162
> port 5001
>
> [ ID] Interval Transfer Bandwidth
>
> [ 3] 0.0-10.1 sec 4.88 MBytes 4.05 Mbits/sec
>
>
> Ouch! That's quite a drop from 13 Gbits/sec. Weirdly, iperf to the
> Debian node is not affected:
>
> From the client to the Debian node:
>
> $ iperf -c 172.20.20.166
>
> ------------------------------------------------------------
>
> Client connecting to 172.20.20.166, TCP port 5001
>
> TCP window size: 1.00 MByte (default)
>
> ------------------------------------------------------------
>
> [ 3] local 172.20.20.169 port 24376 connected with 172.20.20.166
> port 5001
>
> [ ID] Interval Transfer Bandwidth
>
> [ 3] 0.0-10.0 sec 20.4 GBytes 17.5 Gbits/sec
>
>
> From the Debian node to the server:
>
> $ iperf -c 172.20.20.162
>
> ------------------------------------------------------------
>
> Client connecting to 172.20.20.162, TCP port 5001
>
> TCP window size: 23.5 KByte (default)
>
> ------------------------------------------------------------
>
> [ 3] local 172.20.20.166 port 43166 connected with 172.20.20.162
> port 5001
>
> [ ID] Interval Transfer Bandwidth
>
> [ 3] 0.0-10.0 sec 12.9 GBytes 11.1 Gbits/sec
>
>
> But if I let it run for longer, it will apparently figure things out
> and creep back up to normal speed and stay there until NFS strikes
> again. It's like the kernel is caching some sort of hint that
> connectivity to that other host sucks, and it has to either expire or
> be slowly overcome.
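> (If that theory is right, the obvious suspect is the TCP host cache,
> which keeps per-destination RTT and ssthresh estimates and reuses
> them for new connections. It can be inspected and flushed via sysctl
> -- names as on recent FreeBSD, worth double-checking on 9.2:)
>
> $ sysctl net.inet.tcp.hostcache.list | grep 172.20.20.162
> # as root, mark the cache for purging (applied at the next periodic
> # prune):
> $ sysctl net.inet.tcp.hostcache.purge=1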
>
> Client:
>
> $ iperf -c 172.20.20.162 -t 60
>
> ------------------------------------------------------------
>
> Client connecting to 172.20.20.162, TCP port 5001
>
> TCP window size: 1.00 MByte (default)
>
> ------------------------------------------------------------
>
> [ 3] local 172.20.20.169 port 59367 connected with 172.20.20.162
> port 5001
>
> [ ID] Interval Transfer Bandwidth
>
> [ 3] 0.0-60.0 sec 56.2 GBytes 8.04 Gbits/sec
>
>
> Server:
>
> $ netstat -I vtnet1 -ihw 1
>
> input (vtnet1) output
>
> packets errs idrops bytes packets errs bytes colls
>
> 7 0 0 420 0 0 0 0
>
> 7 0 0 420 0 0 0 0
>
> 8 0 0 480 0 0 0 0
>
> 8 0 0 480 0 0 0 0
>
> 7 0 0 420 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 11 0 0 12k 3 0 206 0   <--- starts here
>
> 17 0 0 227k 10 0 660 0
>
> 17 0 0 408k 10 0 660 0
>
> 17 0 0 417k 10 0 660 0
>
> 17 0 0 425k 10 0 660 0
>
> 17 0 0 438k 10 0 660 0
>
> 17 0 0 444k 10 0 660 0
>
> 16 0 0 453k 10 0 660 0
>
> input (vtnet1) output
>
> packets errs idrops bytes packets errs bytes colls
>
> 16 0 0 463k 10 0 660 0
>
> 16 0 0 469k 10 0 660 0
>
> 16 0 0 482k 10 0 660 0
>
> 16 0 0 487k 10 0 660 0
>
> 16 0 0 496k 10 0 660 0
>
> 16 0 0 504k 10 0 660 0
>
> 18 0 0 510k 10 0 660 0
>
> 16 0 0 521k 10 0 660 0
>
> 17 0 0 524k 10 0 660 0
>
> 17 0 0 538k 10 0 660 0
>
> 17 0 0 540k 10 0 660 0
>
> 17 0 0 552k 10 0 660 0
>
> 17 0 0 554k 10 0 660 0
>
> 17 0 0 567k 10 0 660 0
>
> 16 0 0 568k 10 0 660 0
>
> 16 0 0 581k 10 0 660 0
>
> 16 0 0 582k 10 0 660 0
>
> 16 0 0 595k 10 0 660 0
>
> 16 0 0 595k 10 0 660 0
>
> 16 0 0 609k 10 0 660 0
>
> 16 0 0 609k 10 0 660 0
>
> input (vtnet1) output
>
> packets errs idrops bytes packets errs bytes colls
>
> 16 0 0 620k 10 0 660 0
>
> 16 0 0 623k 10 0 660 0
>
> 17 0 0 632k 10 0 660 0
>
> 17 0 0 637k 10 0 660 0
>
> 8.7k 0 0 389M 4.4k 0 288k 0
>
> 42k 0 0 2.1G 21k 0 1.4M 0
>
> 41k 0 0 2.1G 20k 0 1.4M 0
>
> 38k 0 0 1.9G 19k 0 1.2M 0
>
> 40k 0 0 2.0G 20k 0 1.3M 0
>
> 40k 0 0 2.0G 20k 0 1.3M 0
>
> 40k 0 0 2G 20k 0 1.3M 0
>
> 39k 0 0 2G 20k 0 1.3M 0
>
> 43k 0 0 2.2G 22k 0 1.4M 0
>
> 42k 0 0 2.2G 21k 0 1.4M 0
>
> 39k 0 0 2G 19k 0 1.3M 0
>
> 38k 0 0 1.9G 19k 0 1.2M 0
>
> 42k 0 0 2.1G 21k 0 1.4M 0
>
> 44k 0 0 2.2G 22k 0 1.4M 0
>
> 41k 0 0 2.1G 20k 0 1.3M 0
>
> 41k 0 0 2.1G 21k 0 1.4M 0
>
> 40k 0 0 2.0G 20k 0 1.3M 0
>
> input (vtnet1) output
>
> packets errs idrops bytes packets errs bytes colls
>
> 43k 0 0 2.2G 22k 0 1.4M 0
>
> 41k 0 0 2.1G 20k 0 1.3M 0
>
> 40k 0 0 2.0G 20k 0 1.3M 0
>
> 42k 0 0 2.2G 21k 0 1.4M 0
>
> 39k 0 0 2G 19k 0 1.3M 0
>
> 42k 0 0 2.1G 21k 0 1.4M 0
>
> 40k 0 0 2.0G 20k 0 1.3M 0
>
> 42k 0 0 2.1G 21k 0 1.4M 0
>
> 38k 0 0 2G 19k 0 1.3M 0
>
> 39k 0 0 2G 20k 0 1.3M 0
>
> 45k 0 0 2.3G 23k 0 1.5M 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
>
> It almost looks like something is limiting it to 10 packets per
> second. So confusing! TCP super slow start?
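> (One way to see what those ~10 packets per second actually are would
> be a capture on the server during a stall -- interface and addresses
> as in the output above:)
>
> $ tcpdump -i vtnet1 -s 0 -w /tmp/nfs-stall.pcap host 172.20.20.169 and port 2049
> # then eyeball inter-packet gaps and retransmits:
> $ tcpdump -r /tmp/nfs-stall.pcap -ttt | head -50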
>
> Thanks!
>
> (Sorry Rick, forgot to reply all so you got an extra! :( )
>
> Also, here's the netstat from the client side showing the 10 packets
> per second limit and eventual recovery:
>
> $ netstat -I net1 -ihw 1
>
> input (net1) output
>
> packets errs idrops bytes packets errs bytes colls
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 15 0 0 962 11 0 114k 0
>
> 17 0 0 1.1k 10 0 368k 0
>
> 17 0 0 1.1k 10 0 411k 0
>
> 17 0 0 1.1k 10 0 425k 0
>
> 17 0 0 1.1k 10 0 432k 0
>
> 17 0 0 1.1k 10 0 439k 0
>
> 17 0 0 1.1k 10 0 452k 0
>
> 16 0 0 1k 10 0 457k 0
>
> 16 0 0 1k 10 0 467k 0
>
> 16 0 0 1k 10 0 477k 0
>
> 16 0 0 1k 10 0 481k 0
>
> 16 0 0 1k 10 0 495k 0
>
> 16 0 0 1k 10 0 498k 0
>
> 16 0 0 1k 10 0 510k 0
>
> 16 0 0 1k 10 0 515k 0
>
> 16 0 0 1k 10 0 524k 0
>
> 17 0 0 1.1k 10 0 532k 0
>
> input (net1) output
>
> packets errs idrops bytes packets errs bytes colls
>
> 17 0 0 1.1k 10 0 538k 0
>
> 17 0 0 1.1k 10 0 548k 0
>
> 17 0 0 1.1k 10 0 552k 0
>
> 17 0 0 1.1k 10 0 562k 0
>
> 17 0 0 1.1k 10 0 566k 0
>
> 16 0 0 1k 10 0 576k 0
>
> 16 0 0 1k 10 0 580k 0
>
> 16 0 0 1k 10 0 590k 0
>
> 17 0 0 1.1k 10 0 594k 0
>
> 16 0 0 1k 10 0 603k 0
>
> 16 0 0 1k 10 0 609k 0
>
> 16 0 0 1k 10 0 614k 0
>
> 16 0 0 1k 10 0 623k 0
>
> 16 0 0 1k 10 0 626k 0
>
> 17 0 0 1.1k 10 0 637k 0
>
> 18 0 0 1.1k 10 0 637k 0
>
> 17k 0 0 1.1M 34k 0 1.7G 0
>
> 21k 0 0 1.4M 42k 0 2.1G 0
>
> 20k 0 0 1.3M 39k 0 2G 0
>
> 19k 0 0 1.2M 38k 0 1.9G 0
>
> 20k 0 0 1.3M 41k 0 2.0G 0
>
> input (net1) output
>
> packets errs idrops bytes packets errs bytes colls
>
> 20k 0 0 1.3M 40k 0 2.0G 0
>
> 19k 0 0 1.2M 38k 0 1.9G 0
>
> 22k 0 0 1.5M 45k 0 2.3G 0
>
> 20k 0 0 1.3M 40k 0 2.1G 0
>
> 20k 0 0 1.3M 40k 0 2.1G 0
>
> 18k 0 0 1.2M 36k 0 1.9G 0
>
> 21k 0 0 1.4M 41k 0 2.1G 0
>
> 22k 0 0 1.4M 44k 0 2.2G 0
>
> 21k 0 0 1.4M 43k 0 2.2G 0
>
> 20k 0 0 1.3M 41k 0 2.1G 0
>
> 20k 0 0 1.3M 40k 0 2.0G 0
>
> 21k 0 0 1.4M 43k 0 2.2G 0
>
> 21k 0 0 1.4M 43k 0 2.2G 0
>
> 20k 0 0 1.3M 40k 0 2.0G 0
>
> 21k 0 0 1.4M 43k 0 2.2G 0
>
> 19k 0 0 1.2M 38k 0 1.9G 0
>
> 21k 0 0 1.4M 42k 0 2.1G 0
>
> 20k 0 0 1.3M 40k 0 2.0G 0
>
> 21k 0 0 1.4M 42k 0 2.1G 0
>
> 20k 0 0 1.3M 40k 0 2.0G 0
>
> 20k 0 0 1.3M 40k 0 2.0G 0
>
> input (net1) output
>
> packets errs idrops bytes packets errs bytes colls
>
> 24k 0 0 1.6M 48k 0 2.5G 0
>
> 6.3k 0 0 417k 12k 0 647M 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0
>
> 6 0 0 360 0 0 0 0