Re: low TCP speed, wrong rtt measurement

From: Peter 'PMc' Much <pmc_at_citylink.dinoex.sub.org>
Date: Wed, 07 Jun 2023 19:21:28 UTC
Folks,

  I can now say for certain that there is a flaw in the rtt
measurements.

On 2023-04-09, Scheffenegger, Richard <rscheff@freebsd.org> wrote:
> Finally, you could be using SIFTR to track the evolution of the minrtt 
> value over the course of the session.

> Although I suspect ultimately a tcpdump including the tcp header (-s 80) 
> , and the sifter internal state evolution would be optimal to 
> understanding when and why the RTT values go off the rails.

> At first glance, the ertt module may be prone to miscalculations, when 
> retransmissions are in play - no special precautions appear to be 
> present, to distinguish between the originally sent packet, and any 
> retransmission, nor any filtering of ACKs which come in as duplicates. 
> Thus there could be a scenario, where an ACK for a spurious 
> retransmission, e.g. due to reordering, could lead to a wrong baseline 
> RTT measurement, which is physically impossible on such a long distance 
> connection...

Richard, thanks for Your comments. It looks like You are right, and
Your debug advice sounds interesting - but for now I went a different way:

I put timestamping into the h_ertt code and reset the minrtt measurement
after an hour. I think this is a good idea in any case, as physics
might switch during a connection (only the way I did it, straightforward
inline comparing the timestamp at every occasion, is not really good
performance-wise), and mainly, it does already relieve my problem a bit.

On my May upload, the flaw did reproduce. Then I put in that patch.
Now on the June upload, the flaw does reproduce again, and the
patch fixes it after some time. Here is the evidence:

tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas       8712       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123299 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 120796 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14552       7240  1460 off

The cwin appears to be locked at 14552 and doesn't go higher.
And then the patch does it's action:

Jun  7 15:28:47 <kern.crit> edge kernel: [72317] h_ertt reset minrtt 1686087605, was 11 since 1686140926

This runs some 500 miles plus tunnels&filters, so rtt 11 is impossible
here. After resetting it, the cwin is normal again:

tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      20368      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      20368      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14528      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      17448      10136  1460 off
tcp4       0 123381 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      10160       8688  1460 off
tcp4       0 122117 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      14540       8688  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      20380       8688  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      20380       8688  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      15988      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      18908      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      20368      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      20368      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      21828      10136  1460 off
tcp4       0 113559 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      21828      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      23288      10136  1460 off
tcp4       0 123692 192.168.97.17.24379    192.168.99.1.9103      ESTABLISHED vegas      23288      10136  1460 off

So now, with timestamps provided, it makes sense to look into the
tcpdump. The next occasion occurs:

Jun  7 19:44:59 <kern.crit> edge kernel: [87689] h_ertt reset minrtt 1686087605, was 6 since 1686156298

18:44:56.561344 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271901977, win 1518, options [nop,nop,TS val 3960838455 ecr 2751963567], length 0
18:44:56.561438 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271913561:1271915009, ack 0, win 1027, options [nop,nop,TS val 2751963822 ecr 3960838455], length 1448
18:44:56.561446 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271915009:1271916457, ack 0, win 1027, options [nop,nop,TS val 2751963822 ecr 3960838455], length 1448
18:44:56.833069 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271904873, win 1540, options [nop,nop,TS val 3960838833 ecr 2751963582,nop,nop,sack 2 {1271915009:1271916457}{1271909217:1271910665}], length 0
18:44:56.833166 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271916457:1271917905, ack 0, win 1027, options [nop,nop,TS val 2751964093 ecr 3960838833], length 1448
18:44:56.833175 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271917905:1271919353, ack 0, win 1027, options [nop,nop,TS val 2751964093 ecr 3960838833], length 1448
18:44:56.924990 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271904873, win 1540, options [nop,nop,TS val 3960838924 ecr 2751963582,nop,nop,sack 2 {1271915009:1271917905}{1271909217:1271910665}], length 0
18:44:56.926649 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271904873, win 1540, options [nop,nop,TS val 3960838926 ecr 2751963582,nop,nop,sack 2 {1271915009:1271919353}{1271909217:1271910665}], length 0
18:44:56.926731 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271904873:1271906321, ack 0, win 1027, options [nop,nop,TS val 2751964187 ecr 3960838926], length 1448
18:44:57.939383 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271904873:1271906321, ack 0, win 1027, options [nop,nop,TS val 2751965203 ecr 3960838926], length 1448
18:44:57.988205 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271906321, win 1518, options [nop,nop,TS val 3960839988 ecr 2751965203,nop,nop,sack 2 {1271915009:1271919353}{1271909217:1271910665}], length 0
18:44:57.988316 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271906321:1271907769, ack 0, win 1027, options [nop,nop,TS val 2751965248 ecr 3960839988], length 1448
18:44:57.988326 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271907769:1271909217, ack 0, win 1027, options [nop,nop,TS val 2751965248 ecr 3960839988], length 1448
18:44:58.032928 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271907769, win 1518, options [nop,nop,TS val 3960840033 ecr 2751965248,nop,nop,sack 2 {1271915009:1271919353}{1271909217:1271910665}], length 0
18:44:58.033056 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271907769:1271909217, ack 0, win 1027, options [nop,nop,TS val 2751965293 ecr 3960840033], length 1448
18:44:58.033065 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271910665:1271912113, ack 0, win 1027, options [nop,nop,TS val 2751965293 ecr 3960840033], length 1448
18:44:58.033067 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271912113:1271913561, ack 0, win 1027, options [nop,nop,TS val 2751965293 ecr 3960840033], length 1448
18:44:58.033070 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271913561:1271915009, ack 0, win 1027, options [nop,nop,TS val 2751965293 ecr 3960840033], length 1448
18:44:58.042332 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271910665, win 1495, options [nop,nop,TS val 3960840042 ecr 2751965248,nop,nop,sack 1 {1271915009:1271919353}], length 0
18:44:58.077990 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271910665, win 1540, options [nop,nop,TS val 3960840078 ecr 2751965248,nop,nop,sack 2 {1271907769:1271909217}{1271915009:1271919353}], length 0
18:44:58.090877 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271912113, win 1518, options [nop,nop,TS val 3960840091 ecr 2751965293,nop,nop,sack 1 {1271915009:1271919353}], length 0
18:44:58.090972 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271912113:1271913561, ack 0, win 1027, options [nop,nop,TS val 2751965353 ecr 3960840091], length 1448

Here we sent.

18:44:58.090981 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271913561:1271915009, ack 0, win 1027, options [nop,nop,TS val 2751965353 ecr 3960840091], length 1448
18:44:58.098474 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271913561, win 1518, options [nop,nop,TS val 3960840098 ecr 2751965293,nop,nop,sack 1 {1271915009:1271919353}], length 0

And here we got ACK. This is actually 7.5 ms, well, near enough to 6.

18:44:58.098560 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271919353:1271920801, ack 0, win 1027, options [nop,nop,TS val 2751965358 ecr 3960840098], length 1448
18:44:58.107088 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271919353, win 1450, options [nop,nop,TS val 3960840107 ecr 2751965293], length 0
18:44:58.107167 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271920801:1271922249, ack 0, win 1027, options [nop,nop,TS val 2751965368 ecr 3960840107], length 1448
18:44:58.137900 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271919353, win 1540, options [nop,nop,TS val 3960840138 ecr 2751965293,nop,nop,sack 1 {1271912113:1271913561}], length 0

And here is the DSACK.

18:44:58.147276 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271919353, win 1540, options [nop,nop,TS val 3960840147 ecr 2751965293,nop,nop,sack 1 {1271913561:1271915009}], length 0
18:44:58.235268 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271922249, win 1518, options [nop,nop,TS val 3960840225 ecr 2751965358], length 0
18:44:58.235380 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271922249:1271923697, ack 0, win 1027, options [nop,nop,TS val 2751965497 ecr 3960840225], length 1448
18:44:58.235389 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271923697:1271925145, ack 0, win 1027, options [nop,nop,TS val 2751965497 ecr 3960840225], length 1448
18:44:58.235392 IP 192.168.97.17.24379 > 192.168.99.1.9103: Flags [.], seq 1271925145:1271926593, ack 0, win 1027, options [nop,nop,TS val 2751965497 ecr 3960840225], length 1448
18:44:58.305514 IP 192.168.99.1.9103 > 192.168.97.17.24379: Flags [.], ack 1271925145, win 1518, options [nop,nop,TS val 3960840293 ecr 2751965497], length 0

So, what do we do now? This doesn't look very easy to fix. Any
ideas, anybody?

cheerio,
PMc