Re: Cwnd grows slowly during slow-start due to LRO of the receiver side.

From: Hans Petter Selasky <hps_at_selasky.org>
Date: Tue, 02 May 2023 10:04:33 UTC
On 5/2/23 11:14, Hans Petter Selasky wrote:
> Hi Chen!
> 
> The FreeBSD mbufs carry the number of ACKs that have been joined 
> together into the following field:
> 
> m->m_pkthdr.lro_nsegs
> 
> Can this value be of any use to cc_newreno ?
> 
> --HPS

Hi Chen,

Have you tested with FreeBSD main / 14?

The "nsegs" are passed along like this:

nsegs = max(1, m->m_pkthdr.lro_nsegs);

...

cc_ack_received(tp, th, nsegs, CC_ACK);

...

(Newreno - FreeBSD-14)

    incr = min(ccv->bytes_this_ack,
        ccv->nsegs * abc_val *
        CCV(ccv, t_maxseg));

And in FreeBSD-10, which is mentioned in your article:

(Newreno - FreeBSD-10)

    incr = min(ccv->bytes_this_ack,
        V_tcp_abc_l_var * CCV(ccv, t_maxseg));


There is no such nsegs factor in that version.

This issue may already have been fixed!
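
To make the difference between the two expressions concrete, here is a
minimal standalone C sketch, not the kernel code. The helper names
(min_u32, incr_fbsd10, incr_fbsd14, slow_start_segs) are made up for
illustration, the ABC limit is assumed to be 2 and the MSS 1448 as in
the trace, and the model assumes a single ACK per RTT that covers the
whole flight. The sketch takes nsegs as an input; it says nothing about
what nsegs actually is for the ACKs in your capture.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; only the two min() expressions come from the mail. */
uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

/* FreeBSD-10 style: the cap is independent of how many ACKs were merged. */
uint32_t
incr_fbsd10(uint32_t bytes_this_ack, uint32_t abc_l_var, uint32_t maxseg)
{
	return (min_u32(bytes_this_ack, abc_l_var * maxseg));
}

/* FreeBSD-14 style: the cap is multiplied by nsegs. */
uint32_t
incr_fbsd14(uint32_t bytes_this_ack, uint32_t nsegs, uint32_t abc_val,
    uint32_t maxseg)
{
	assert(nsegs >= 1);	/* mirrors nsegs = max(1, lro_nsegs) */
	return (min_u32(bytes_this_ack, nsegs * abc_val * maxseg));
}

/*
 * Toy slow-start model (an assumption, not measured behaviour): each RTT
 * one ACK arrives that covers the whole flight.  If nsegs_tracks_flight
 * is zero, nsegs is pinned at 1; otherwise nsegs equals the number of
 * segments in the flight.  Returns cwnd in segments after `rtts` rounds.
 */
uint32_t
slow_start_segs(uint32_t init_segs, uint32_t rtts, int nsegs_tracks_flight)
{
	const uint32_t maxseg = 1448;	/* MSS seen in the trace */
	const uint32_t abc_val = 2;	/* assumed ABC limit */
	uint32_t cwnd = init_segs * maxseg;

	for (uint32_t i = 0; i < rtts; i++) {
		uint32_t flight_segs = cwnd / maxseg;
		uint32_t nsegs = nsegs_tracks_flight ? flight_segs : 1;

		cwnd += incr_fbsd14(cwnd, nsegs, abc_val, maxseg);
	}
	return (cwnd / maxseg);
}
```

With nsegs pinned at 1, slow_start_segs(10, 3, 0) returns 16, which
matches the 10, 12, 14, 16 progression in the quoted trace. With nsegs
equal to the flight size, slow_start_segs(10, 3, 1) returns 80, i.e.
doubling per RTT. So under this model the FreeBSD-14 expression only
helps when nsegs is larger than 1 for the ACK in question.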

--HPS
> 
> On 5/2/23 09:46, Chen Shuo wrote:
>> As per newreno_ack_received() in sys/netinet/cc/cc_newreno.c,
>> the FreeBSD TCP sender strictly follows RFC 5681 with the RFC 3465
>> extension.  That is, during slow-start, when receiving an ACK of
>> 'bytes_acked':
>>
>>      cwnd += min(bytes_acked, abc_l_var * SMSS);  // abc_l_var = 2 dflt
>>
>> As discussed in section 3.2 of RFC 3465, L=2*SMSS bytes exactly balances
>> the negative impact of the delayed ACK algorithm.  RFC 5681 also
>> requires that a receiver SHOULD generate an ACK for at least every
>> second full-sized segment, so bytes_acked per ACK is at most 2 * SMSS.
>> If both sender and receiver follow it, cwnd should grow exponentially
>> during slow-start:
>>
>>      cwnd *= 2    (per RTT)
>>
>> However, LRO and TSO are widely used today, so the receiver may generate
>> far fewer ACKs than it used to.  As I observed, both FreeBSD and
>> Linux generate at most one ACK per segment assembled by LRO/GRO.
>> The worst case is one ACK per 45 MSS, as 45 * 1448 = 65160 < 65535.
>>
>> Sending 1MB over a link of 100ms delay from FreeBSD 13.2:
>>
>>   0.000 IP sender > sink: Flags [S], seq 205083268, win 65535, options
>> [mss 1460,nop,wscale 10,sackOK,TS val 495212525 ecr 0], length 0
>>   0.100 IP sink > sender: Flags [S.], seq 708257395, ack 205083269, win
>> 65160, options [mss 1460,sackOK,TS val 563185696 ecr
>> 495212525,nop,wscale 7], length 0
>>   0.100 IP sender > sink: Flags [.], ack 1, win 65, options [nop,nop,TS
>> val 495212626 ecr 563185696], length 0
>>   // TSopt omitted below for brevity.
>>
>>   // cwnd = 10 * MSS, sent 10 * MSS
>>   0.101 IP sender > sink: Flags [.], seq 1:14481, ack 1, win 65, 
>> length 14480
>>
>>   // got one ACK for 10 * MSS, cwnd += 2 * MSS, sent 12 * MSS
>>   0.201 IP sink > sender: Flags [.], ack 14481, win 427, length 0
>>   0.201 IP sender > sink: Flags [.], seq 14481:31857, ack 1, win 65, 
>> length 17376
>>
>>   // got ACK of 12*MSS above, cwnd += 2 * MSS, sent 14 * MSS
>>   0.301 IP sink > sender: Flags [.], ack 31857, win 411, length 0
>>   0.301 IP sender > sink: Flags [.], seq 31857:52129, ack 1, win 65, 
>> length 20272
>>
>>   // got ACK of 14*MSS above, cwnd += 2 * MSS, sent 16 * MSS
>>   0.402 IP sink > sender: Flags [.], ack 52129, win 395, length 0
>>   0.402 IP sender > sink: Flags [P.], seq 52129:73629, ack 1, win 65,
>> length 21500
>>   0.402 IP sender > sink: Flags [.], seq 73629:75077, ack 1, win 65, 
>> length 1448
>>
>> As a consequence, instead of growing exponentially, cwnd grows
>> more-or-less quadratically during slow-start, unless abc_l_var is
>> set to a sufficiently large value.
>>
>> NewReno took more than 20 seconds to ramp up throughput to 100Mbps
>> over an emulated 100ms delay link, while Linux took ~2 seconds.
>> I can provide the pcap file if anyone is interested.
>>
>> Switching to CUBIC won't help, because it uses the logic in NewReno
>> ack_received() for slow start.
>>
>> Is this a well-known issue, and is abc_l_var the only cure for it?
>> https://calomel.org/freebsd_network_tuning.html
>>
>> Thank you!
>>
>> Best,
>> Shuo Chen
>>
> 
>