recwin change

Scheffenegger, Richard Richard.Scheffenegger at netapp.com
Tue Apr 21 14:37:52 UTC 2020


Ah, sorry, I must have misread earlier.

This came up in my packetdrill test script, which I wrote to look into premature shrinking of the right edge of the receive window when window scaling is in effect. When I try to clamp the receive window down to a really low value, it is set to no less than 64 kB:

[root@freebsd ~]# cat newreno-shrinking-window.pkt
// A simple server-side test that sends exactly an initial window (IW10)
// worth of packets.

--tolerance_usecs=500000

// Flush Hostcache
//0.0 `kldload cc_cubic`
0.0 `sysctl net.inet.tcp.cc.algorithm=newreno`
0.1 `sysctl net.inet.tcp.initcwnd_segments=10`
0.2 `sysctl net.inet.tcp.hostcache.purgenow=1`
0.3 `sysctl net.inet.tcp.rfc3465=0`
//0.3 `sync` // in case of crash

// Create a listening TCP socket.
0.50 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0.005 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0.005 setsockopt(3, SOL_SOCKET, SO_DEBUG, [1], 4) = 0
+0.005 setsockopt(3, SOL_SOCKET, SO_SNDBUF, [1048576], 4) = 0
+0.005 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [70000], 4) = 0
+0.005 bind(3, ..., ...) = 0
+0.005 listen(3, 1) = 0

// Establish a TCP connection with ECN to explicitly track CWR
// Set WindowScale to multiplicative factor of 1kB to allow huge increase
+0.035 < S  0:0(0) win 65535 <mss 1460, sackOK, wscale 10, eol, nop, nop>
+0.000 > S. 0:0(0) ack 1 win 65535 <mss 1460,nop,wscale 6,sackOK,eol,eol>
+0.000 <  . 1:1(0) ack 1 win 65535
+0.000 accept(3, ..., ...) = 4

+0.005 setsockopt(4, SOL_SOCKET, SO_SNDBUF, [1048576], 4) = 0
+0.005 setsockopt(4, SOL_SOCKET, SO_RCVBUF, [10000], 4) = 0
//+0     >  . 1:1(0) ack 1

// Filling up the receive buffer
+0 < .     1:1461(1460) ack 1 win 65535
+0 < .  1461:2921(1460) ack 1 win 65535
+0 > .     1:1(0) ack 2921 win 978  // 62592

+0.005 setsockopt(4, SOL_SOCKET, SO_SNDBUF, [10000], 4) = 0
+0.005 setsockopt(4, SOL_SOCKET, SO_RCVBUF, [10000], 4) = 0

+0 < .  2921:4381(1460) ack 1 win 65535
+0 < .  4381:5841(1460) ack 1 win 65535
+0 > .     1:1(0) ack 5841 win 932  // 59648

+0 < .  5841:5999(158) ack 1 win 65535
+0 < P. 5999:6000(1) ack 1 win 65535
+0 > .     1:1(0) ack 6000 win 930  // 59520



Richard Scheffenegger

From: Jonathan Looney <jtl at netflix.com>
Sent: Tuesday, 21 April 2020 16:27
To: Scheffenegger, Richard <Richard.Scheffenegger at netapp.com>
Cc: transport at freebsd.org; Michael Tuexen <tuexen at freebsd.org>; Randall Stewart <rrs at netflix.com>; Lawrence Stewart <lstewart at netflix.com>; rgrimes at freebsd.org; Cui, Cheng <Cheng.Cui at netapp.com>
Subject: Re: recwin change


On Tue, Apr 21, 2020 at 9:59 AM Scheffenegger, Richard <Richard.Scheffenegger at netapp.com> wrote:
Hi Jonathan,

In your larger patch to fix up long int to int32_t

https://reviews.freebsd.org/rS306769#change-l6GoMSS8L7SS

You seem to have slipped in a functional change for the receive window:


-       recwin = sbspace(&so->so_rcv);
+       recwin = lmin(lmax(sbspace(&so->so_rcv), 0),
+           (long)TCP_MAXWIN << tp->rcv_scale);

https://reviews.freebsd.org/D7073 makes it clear that lmax(sbspace(&so->so_rcv), 0) is there to keep a negative value from being signaled as a very large receive window.

However, that change also signals at least TCP_MAXWIN, even when the socket receive buffer may be much smaller.

I don't think I understand what you are suggesting. Can you give an example where this may occur?



And the (long) typecast was missed in your fix-up to get rid of all longs in the TCP stack 😉.

Actually, that was purposeful. Because this is being sent through a function which expects a long, this ensures the value will be treated as a long. It is probably unnecessary, but it shouldn't be harmful.

Jonathan

