TCP window updates combined with dup acks sent in response to packet loss

Lawrence Stewart lstewart at room52.net
Wed Mar 11 14:43:35 UTC 2015


[BCCed to freebsd-net at freebsd.org]

Hi all,

Please consider the code below from FreeBSD 10.1's TCP input processing:


        /*
         * In ESTABLISHED state: drop duplicate ACKs; ACK out of range
         * ACKs.  If the ack is in the range
         *      tp->snd_una < th->th_ack <= tp->snd_max
         * then advance tp->snd_una to th->th_ack and drop
         * data from the retransmission queue.  If this ACK reflects
         * more up to date window information we update our window
         * information.
         */
        case TCPS_ESTABLISHED:
        case TCPS_FIN_WAIT_1:
        case TCPS_FIN_WAIT_2:
        case TCPS_CLOSE_WAIT:
        case TCPS_CLOSING:
        case TCPS_LAST_ACK:
                if (SEQ_GT(th->th_ack, tp->snd_max)) {
                        TCPSTAT_INC(tcps_rcvacktoomuch);
                        goto dropafterack;
                }
                if ((tp->t_flags & TF_SACK_PERMIT) &&
                    ((to.to_flags & TOF_SACK) ||
                     !TAILQ_EMPTY(&tp->snd_holes)))
                        tcp_sack_doack(tp, &to, th->th_ack);

                /* Run HHOOK_TCP_ESTABLISHED_IN helper hooks. */
                hhook_run_tcp_est_in(tp, th, &to);

                if (SEQ_LEQ(th->th_ack, tp->snd_una)) {
                        if (tlen == 0 && tiwin == tp->snd_wnd) {
                                TCPSTAT_INC(tcps_rcvdupack);

                                <dupack processing omitted>


For an ACK to count as a dupack for the purposes of triggering fast
retransmit/fast recovery, it must not update the window, as per the
"tiwin == tp->snd_wnd" condition above. That condition has existed since
at least 4.4BSD (I didn't bother looking further back in history).
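
To make the condition concrete, here is a minimal standalone sketch of
the classification described above. It is not the kernel code: seq_leq()
and counts_as_dupack() are hypothetical names, the arguments stand in
for th->th_ack, tp->snd_una, tlen, tiwin and tp->snd_wnd, and window
scaling is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a <= b" on 32-bit sequence numbers (same idea as SEQ_LEQ). */
static bool seq_leq(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) <= 0;
}

/*
 * Hypothetical helper mirroring the quoted test: an ACK only counts as a
 * dupack if it acknowledges nothing new, carries no data, and advertises
 * exactly the window the sender already has recorded.
 */
static bool counts_as_dupack(uint32_t th_ack, uint32_t snd_una,
                             uint32_t tlen, uint32_t tiwin, uint32_t snd_wnd)
{
    return seq_leq(th_ack, snd_una) && tlen == 0 && tiwin == snd_wnd;
}
```

With values shaped like the attached trace, an ACK for 108834 that
repeats the recorded window of 7734 passes the test, while the same ACK
advertising 7824 does not.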

Grenville (CCed) has captured evidence of this condition forcing
connections to recover via RTO when legitimate dupacks sent in response
to a packet loss also contain a window update.

In the example tcpdump excerpt attached, the advertised receive window
is growing as the Linux receiver side app reads data from the socket
buffer while the dupacks are generated. The FreeBSD sender side sees
them as window updates and processes them as such, bypassing the dupack
processing code. The send window is consumed and, because only dupacks
are returning, UNA is not advanced, so the connection stalls and needs
two RTOs to recover from the two dropped packets.
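
The effect can be reproduced with a small standalone sketch that replays
the advertised windows from the attached excerpt. Everything here is a
simplification for illustration: count_dupacks() and trace_wins are
hypothetical names, the raw "win" values stand in for tiwin (window
scaling ignored), the sender's recorded window is assumed to start at
7734 (the value in the first ACK shown), and every ACK with a different
window is assumed to be accepted as a window update.

```c
#include <stddef.h>
#include <stdint.h>

/* "win" values of the 13 zero-length ACKs for 108834 that follow the first one. */
static const uint32_t trace_wins[] = {
    7824, 7915, 8005, 8096, 8186, 8277, 8367,
    8458, 8548, 8639, 8729, 8820, 8910
};

/*
 * Count how many ACKs pass the "tiwin == snd_wnd" test. An ACK that
 * advertises a different window is treated as a window update instead,
 * and the recorded window follows it.
 */
static int count_dupacks(const uint32_t *wins, size_t n, uint32_t snd_wnd)
{
    int dupacks = 0;

    for (size_t i = 0; i < n; i++) {
        if (wins[i] == snd_wnd)
            dupacks++;              /* would feed fast retransmit */
        else
            snd_wnd = wins[i];      /* processed as a window update */
    }
    return dupacks;
}
```

Under those assumptions, replaying trace_wins with an initial window of
7734 counts no dupacks at all, whereas three ACKs repeating the same
window would count three, which is the usual fast retransmit threshold.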

With SACK in the mix, as is the case for the provided example, it seems
obvious to me that the existence of SACK blocks could and should be used
to realise that a window update combined with a dupack is not a pure
window update, and therefore should be processed by the dupack handling
code for fast retransmit/fast recovery purposes.
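
One possible shape for such a check is sketched below. It is only an
illustration of the idea, not a proposed patch: is_dupack_sack_aware()
and its parameters are hypothetical, sack_permitted stands in for the
TF_SACK_PERMIT test and has_sack_blocks for the TOF_SACK test, and it
does not try to tell DSACK or stale SACK blocks apart from new
information.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a <= b" on 32-bit sequence numbers (same idea as SEQ_LEQ). */
static bool seq_leq(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) <= 0;
}

/*
 * Hypothetical SACK-aware variant of the dupack test: a zero-length ACK
 * that acknowledges nothing new still counts as a dupack when its window
 * differs, provided SACK was negotiated and the ACK carries SACK blocks.
 */
static bool is_dupack_sack_aware(uint32_t th_ack, uint32_t snd_una,
                                 uint32_t tlen, uint32_t tiwin,
                                 uint32_t snd_wnd, bool sack_permitted,
                                 bool has_sack_blocks)
{
    if (!seq_leq(th_ack, snd_una) || tlen != 0)
        return false;               /* new data acked, or segment has data */
    if (tiwin == snd_wnd)
        return true;                /* the existing dupack condition */
    return sack_permitted && has_sack_blocks;
}
```

Applied to the second ACK in the attached excerpt (ack 108834, win 7824,
one SACK block), this variant would classify it as a dupack even though
the window changed, while a window-changing ACK without SACK blocks
would still be treated as a pure window update.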

For connections without SACK, is there any existing guidance on how to
deal with this? My Google fu and a skim of the RFCs that seemed
relevant turned up nothing, so I'm imagining a potential algorithm for
inferring whether a window update can count as a dupack for fast
retransmit/fast recovery purposes. I wanted to get some input from
others before spending any more time on it though.

Cheers,
Lawrence
-------------- next part --------------
00:00:00.000241 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 7734, options [nop,nop,TS val 26486191 ecr 4183194278], length 0
00:00:00.000216 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 7824, options [nop,nop,TS val 26486191 ecr 4183194278,nop,nop,sack 1 {110282:111730}], length 0
00:00:00.008025 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 7915, options [nop,nop,TS val 26486199 ecr 4183194278,nop,nop,sack 2 {113178:114626}[|tcp]>
00:00:00.000233 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8005, options [nop,nop,TS val 26486200 ecr 4183194278,nop,nop,sack 2 {113178:116074}[|tcp]>
00:00:00.000248 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8096, options [nop,nop,TS val 26486200 ecr 4183194278,nop,nop,sack 2 {113178:117522}[|tcp]>
00:00:00.000217 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8186, options [nop,nop,TS val 26486200 ecr 4183194278,nop,nop,sack 2 {113178:118970}[|tcp]>
00:00:00.000269 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8277, options [nop,nop,TS val 26486200 ecr 4183194278,nop,nop,sack 2 {113178:120418}[|tcp]>
00:00:00.000242 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8367, options [nop,nop,TS val 26486201 ecr 4183194278,nop,nop,sack 2 {113178:121866}[|tcp]>
00:00:00.000250 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8458, options [nop,nop,TS val 26486201 ecr 4183194278,nop,nop,sack 2 {113178:123314}[|tcp]>
00:00:00.000209 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8548, options [nop,nop,TS val 26486201 ecr 4183194278,nop,nop,sack 2 {113178:124762}[|tcp]>
00:00:00.000263 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8639, options [nop,nop,TS val 26486201 ecr 4183194278,nop,nop,sack 2 {113178:126210}[|tcp]>
00:00:00.000244 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8729, options [nop,nop,TS val 26486202 ecr 4183194278,nop,nop,sack 2 {113178:127658}[|tcp]>
00:00:00.000243 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8820, options [nop,nop,TS val 26486202 ecr 4183194278,nop,nop,sack 2 {113178:129106}[|tcp]>
00:00:00.000218 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 108834, win 8910, options [nop,nop,TS val 26486202 ecr 4183194278,nop,nop,sack 2 {113178:130554}[|tcp]>
00:00:00.282769 IP 172.16.11.63.82 > 172.16.10.60.51677: Flags [.], seq 108834:110282, ack 100, win 1040, options [nop,nop,TS val 4183194632 ecr 26486202], length 1448
00:00:00.002284 IP 172.16.10.60.51677 > 172.16.11.63.82: Flags [.], ack 111730, win 9001, options [nop,nop,TS val 26486487 ecr 4183194632,nop,nop,sack 1 {113178:130554}], length 0
00:00:00.331428 IP 172.16.11.63.82 > 172.16.10.60.51677: Flags [.], seq 111730:113178, ack 100, win 1040, options [nop,nop,TS val 4183194966 ecr 26486487], length 1448

