Window updates during periods of HIGH packet loss
cjharrer at comcast.net
Wed Jun 13 21:40:08 UTC 2012
Oops, I left one part out below...
When snd_nxt == snd_una == snd_max, snd_wnd == 0, which is why we can't send any new data.
----- Original Message -----
From: cjharrer at comcast.net
To: freebsd-net at freebsd.org
Cc: "Christopher J. Harrer" <cjharrer at comcast.net>
Sent: Wednesday, June 13, 2012 5:02:57 PM
Subject: Window updates during periods of HIGH packet loss
We're running FreeBSD 8.0 Stable and are running into an issue with
window updates during periods of very high DUPLEX network
traffic where a good number of network packets are being
dropped. Please let me know if there is a better list to ask this
question on.
I'm going to use some small, made up numbers to demonstrate what is
going on. I can go into more detail with explicit numbers from an
internal trace that I created, but it gets pretty long and tedious, so
I'd like to see if my example makes sense first.
We have a server that is running a lot of NFS traffic (NFSv3 over
TCP/IPv4) to a NetApp back-end filer (not sure the filer matters).
For the purpose of this problem description, let's assume that I have
a send and receive window of 10,000.
We have a lot of data outstanding in the network (let's say 9000 bytes)
to the back-end filer (seq 1,000 to 9,999). The filer is sending us a lot
of data concurrently with the 9000 bytes we just sent. Let's assume our
rcv_nxt is 1,000.
We receive a segment with th_seq of 10,000 (out of order) from the filer
and it ACKs all of our outstanding data. So, snd_wl1 becomes 10,000 and
snd_wl2 becomes 9,999. Our snd_wnd is now 10000, so we begin to send
new data (again, we blast it out, so let's assume we have 10,000 more
bytes sent).
The filer is "resending" sequence numbers 1,000 through 9,999 because the
new data we are sending contains SACK blocks instructing it to. The
retransmitted data we are receiving also ACKs our newly sent data, so
that by the time we receive the segment with th_seq 9,000, which runs up
to 9,999 (and completes our out-of-order processing), all of our data is
ACKed. Now, here's where the problem arises:
1) In processing a window update (step6 in tcp_input), the 2nd check that is
made is to ensure that tp->snd_wl1 < th->th_seq; in this case, it's not:
10,000 is not less than 9,000. The next check needs th->th_seq ==
tp->snd_wl1, which also fails, so no window update is done.
2) After tcp_reass handles the receipt of the last segment that fills in the
"hole" in our stream, tp->t_flags |= TF_ACKNOW is executed (this flag causes
tcp_output to skip the check to start the PERSIST timer, because it must
force a send; in this case, the send is just an ACK). Any time tcp_reass
returns, TF_ACKNOW is set.
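To make point 1 concrete, here is a minimal sketch of the window-update
acceptance test, modeled on the RFC 793 rule plus the extra "window grew"
clause as I read it in step6. The struct win_state and the function names
are hypothetical stand-ins for illustration, not the actual kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Wraparound-safe "a < b" for 32-bit sequence numbers. */
static bool
seq_lt(tcp_seq a, tcp_seq b)
{
    return (int32_t)(a - b) < 0;
}

/* Hypothetical, trimmed-down state; field names mirror struct tcpcb. */
struct win_state {
    tcp_seq  snd_wl1;   /* th_seq of the segment used for the last update */
    tcp_seq  snd_wl2;   /* th_ack of the segment used for the last update */
    uint32_t snd_wnd;   /* current send window */
};

/*
 * Accept the advertised window only if the segment is "newer" than the
 * one that produced the last update. Returns true if snd_wnd was updated.
 */
static bool
maybe_update_window(struct win_state *tp, tcp_seq th_seq, tcp_seq th_ack,
    uint32_t tiwin)
{
    if (seq_lt(tp->snd_wl1, th_seq) ||
        (tp->snd_wl1 == th_seq &&
         (seq_lt(tp->snd_wl2, th_ack) ||
          (tp->snd_wl2 == th_ack && tiwin > tp->snd_wnd)))) {
        tp->snd_wnd = tiwin;
        tp->snd_wl1 = th_seq;
        tp->snd_wl2 = th_ack;
        return true;
    }
    return false;
}
```

With the made-up numbers above (snd_wl1 = 10,000, snd_wl2 = 9,999,
snd_wnd = 0), a retransmitted segment with th_seq 9,000 is rejected no
matter what window it advertises, while a segment with th_seq 10,000 and
a newer ACK would be accepted.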
A new send came down to us while we were sending data into our open window,
so now we're stuck: tp->snd_nxt == tp->snd_una == tp->snd_max
and so_snd.sb_cc != 0, while neither TT_PERSIST nor TT_REXMT is running.
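The stuck state can be written down as a simple predicate. This is only a
sketch for checking a trace by hand: struct conn_snapshot and
sender_is_stalled are hypothetical names I made up, and the fields are
just the ones mentioned in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the fields named in this thread. */
struct conn_snapshot {
    uint32_t snd_una;
    uint32_t snd_nxt;
    uint32_t snd_max;
    uint32_t snd_wnd;          /* peer's advertised window as we see it */
    uint32_t sb_cc;            /* bytes queued in so_snd */
    bool     persist_running;  /* TT_PERSIST armed */
    bool     rexmt_running;    /* TT_REXMT armed */
};

/*
 * True when there is data to send, the window looks closed, nothing is
 * in flight, and no timer is armed that would ever trigger another send.
 */
static bool
sender_is_stalled(const struct conn_snapshot *c)
{
    bool nothing_in_flight =
        c->snd_una == c->snd_nxt && c->snd_nxt == c->snd_max;
    bool data_waiting = c->sb_cc != 0;
    bool window_closed = c->snd_wnd == 0;
    bool no_timer = !c->persist_running && !c->rexmt_running;

    return nothing_in_flight && data_waiting && window_closed && no_timer;
}
```

If either timer were armed, the predicate would be false, since a persist
probe or a retransmit would eventually elicit an ACK carrying a usable
window.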
Eventually the filer sends us a FIN to close an "idle" client
connection; which is normal operation in this configuration.
I have not looked at more recent versions of the FreeBSD code yet; I will
start doing that now. I just wanted to ask the experts if I'm missing
something here, because it feels like I am.
Thanks in advance for any insight you can provide.
Regards,
Chris