LRO causing stretch ACK violations interacts badly with delayed ACKing
Colin Percival
cperciva at freebsd.org
Thu Oct 17 22:04:57 UTC 2013
Hi all,
I know {TSO, LRO, ACKing policy} has been discussed here recently, and I don't
want to rehash everything, but I'm seeing some very bad misbehaviour with LRO
and delayed ACKing turned on.
Running 'fetch -o /dev/null https://www.amazon.com/' on an EC2 instance running
FreeBSD 10.0-BETA1:
> 00:00:00.000000 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [S], seq 3375534763, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 1606713 ecr 0], length 0
> 00:00:00.000754 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [S.], seq 3035209700, ack 3375534764, win 8190, options [mss 1460,nop,wscale 6], length 0
> 00:00:00.000788 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [.], ack 1, win 1026, length 0
> 00:00:00.002003 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [.], ack 1, win 127, length 0
> 00:00:00.028959 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [P.], seq 1:318, ack 1, win 1026, length 317
> 00:00:00.029884 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [.], ack 318, win 108, length 0
> 00:00:00.029925 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [.], seq 1:4097, ack 318, win 108, length 4096
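Amazon's SSL certificate is too large to fit into their initial send window, and
although FreeBSD has received 4096 bytes (more than 2 MSS, and thus enough that
it SHOULD send an ACK according to RFC 2581, since obsoleted by RFC 5681, which
keeps this requirement), we delay that ACK here. LRO has coalesced the incoming
segments into a single 4096-byte packet, as the length in the last line above
shows, so the stack apparently treats it as just one segment.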
100 ms later, our delayed-ACK timer fires, we send the ACK, Amazon's TCP stack
finishes sending their SSL certificate, and everything starts moving again.
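
To make the stretch-ACK problem concrete, here is a minimal sketch of the two
ways a receiver could decide whether to ACK immediately. It is not FreeBSD's
actual code: the function names are hypothetical, and it assumes the stack
counts received segments, so that an LRO-coalesced packet counts as one segment
no matter how many bytes it carries.

```python
MSS = 1460  # taken from the "mss 1460" option in the SYN exchange above


def ack_now_by_segments(segments_pending: int) -> bool:
    """Segment-counting rule (assumed): ACK once two segments are pending."""
    return segments_pending >= 2


def ack_now_by_bytes(bytes_pending: int, mss: int = MSS) -> bool:
    """Byte-counting rule: ACK once at least 2 * MSS bytes are pending."""
    return bytes_pending >= 2 * mss


# LRO hands the stack one coalesced packet of 4096 bytes.
coalesced_lengths = [4096]
segments_pending = len(coalesced_lengths)
bytes_pending = sum(coalesced_lengths)

print("segment rule ACKs now:", ack_now_by_segments(segments_pending))
print("byte rule ACKs now:   ", ack_now_by_bytes(bytes_pending))
```

Under these assumptions the segment-counting rule delays the ACK while the
byte-counting rule sends it at once, which matches what the trace shows.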
> 00:00:00.129497 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [.], ack 4097, win 1026, length 0
> 00:00:00.130258 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [P.], seq 4097:4241, ack 318, win 108, length 144
> 00:00:00.131332 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [P.], seq 318:632, ack 4241, win 1026, length 314
> 00:00:00.136398 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [P.], seq 4241:4288, ack 632, win 128, length 47
> 00:00:00.136773 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [P.], seq 632:1011, ack 4288, win 1026, length 379
> 00:00:00.141006 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [F.], seq 4867, ack 1011, win 144, length 0
> 00:00:00.141022 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [.], ack 4288, win 1026, length 0
> 00:00:00.141033 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [P.], seq 4288:4867, ack 1011, win 144, length 579
> 00:00:00.141059 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [.], ack 4868, win 1017, length 0
> 00:00:00.141167 IP 10.142.128.54.13252 > 176.32.98.166.443: Flags [P.], seq 1011:1038, ack 4868, win 1026, length 27
> 00:00:00.142036 IP 176.32.98.166.443 > 10.142.128.54.13252: Flags [R], seq 3035214568, win 9700, length 0
Out of the 142 ms that this TCP connection was alive, 100 ms were wasted waiting
for the delayed-ACK timer. This seems like something which ought to be fixed...
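
The figures can be checked against the trace timestamps. This is only a small
arithmetic sketch; the variable names are mine, and the three values are copied
from the lines where the 4096-byte packet arrives, where our delayed ACK goes
out, and where the final RST is received.

```python
data_arrived = 0.029925    # 4096-byte packet received
ack_sent = 0.129497        # delayed ACK finally sent
connection_end = 0.142036  # final RST from the server

stall_ms = (ack_sent - data_arrived) * 1000
total_ms = connection_end * 1000
stall_fraction = stall_ms / total_ms

print(f"{stall_ms:.1f} ms stalled of {total_ms:.1f} ms ({stall_fraction:.0%})")
```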
--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid