Dropping TCP options from retransmitted SYNs considered harmful
George Neville-Neil
gnn at neville-neil.com
Tue Oct 16 02:11:38 UTC 2012
On Oct 15, 2012, at 09:08 , John Baldwin <jhb at freebsd.org> wrote:
> On Friday, October 12, 2012 12:13:11 pm John Baldwin wrote:
>> Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs
>> starting with the 3rd SYN in this block in tcp_timer.c:
>>
>> /*
>> * Disable rfc1323 if we haven't got any response to
>> * our third SYN to work-around some broken terminal servers
>> * (most of which have hopefully been retired) that have bad VJ
>> * header compression code which trashes TCP segments containing
>> * unknown-to-them TCP options.
>> */
>> if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
>> tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
>>
>> There is even a PR for the original bug report: kern/1689
>>
>> However, there is an unintended consequence of this change that can be
>> disastrous. Specifically, suppose you have a FreeBSD client connecting to a
>> server, and that the SYNs are arriving at the server successfully, but the
>> first few return SYN/ACKs are dropped. Eventually a SYN/ACK makes it through
>> and the connection is established.
>>
>> The server (based on the first SYN it saw) believes it has negotiated window
>> scaling with the client. The client, however, has broken what it promised in
>> that first SYN and believes it is not using any window scaling at all. This
>> causes two forms of breakage:
>>
>> 1) When the server advertises a scaled window (e.g. '8' for a 64k window
>> scaled at 13), the client thinks it is an unscaled window ('8') and
>> sends data to the server very slowly.
>>
>> 2) When the client advertises an unscaled window (e.g. '65535' for a 64k
>> window), the server thinks it has a huge window (65535 << 13 == 511MB)
>> to send into.
>>
>> I'm not sure that 2) is a problem per se, but I have definitely seen instances
>> of 1) (and examined the 'struct tcpcb' in kgdb on both the server and client
>> end of the connections to verify they disagreed on the scaling).
>>
>> The original motivation of this change is to work around broken terminal
>> servers that were old when this change was added in 2001. Over 10 years later
>> I think we should at least have an option to turn this work-around off, and
>> possibly disable it by default.
>>
>> Thoughts?
>
> How about this:
>
> Index: tcp_timer.c
> ===================================================================
> --- tcp_timer.c (revision 241579)
> +++ tcp_timer.c (working copy)
> @@ -118,6 +118,11 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, CTLFL
> /* max idle probes */
> int tcp_maxpersistidle;
>
> +static int tcp_rexmit_drop_options = 0;
> +SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW,
> + &tcp_rexmit_drop_options, 0,
> + "Drop TCP options from 3rd and later retransmitted SYN");
> +
> static int per_cpu_timers = 0;
> SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW,
> &per_cpu_timers , 0, "run tcp timers on all cpus");
> @@ -578,7 +583,8 @@ tcp_timer_rexmt(void * xtp)
> * header compression code which trashes TCP segments containing
> * unknown-to-them TCP options.
> */
> - if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
> + if (tcp_rexmit_drop_options && (tp->t_state == TCPS_SYN_SENT) &&
> + (tp->t_rxtshift == 3))
> tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
> /*
> * If we backed off this far, our srtt estimate is probably bogus.
>
> Any other suggestions on the sysctl name?
The name's fine. Commit that sucker and turn it off.
Best,
George
More information about the freebsd-net
mailing list