List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net
From: Pavel Vazharov
Date: Mon, 9 Dec 2024 11:50:02 +0200
Subject: Interaction between the re-transmit and keep-alive logic.
To: freebsd-net@freebsd.org

Hi there,

We are using the network stack of FreeBSD 13 on top of DPDK in our application. During the latest tests in the lab I stumbled upon the following situation:

1. In the test, Apache Bench opens 5000 parallel connections and each one downloads 1 MB of data. This causes the client NIC to start dropping packets due to overflows, which is intentional.
2. The server side is our application with the FreeBSD stack. The client side is Ubuntu 24.04 with Linux 6.8.0.
3. A connection is opened and the download starts on it. At some point the first drops occur and, according to the tcpdump from the client side, it takes a few seconds before the connection recovers. However, these drops lead to increased values of t_srtt and t_rttvar, and thus to an increased value of t_rxtcur.
4. The window opens again up to 100-200 KB with lots of packets in flight, and the drops start again. They cause the retransmit timer on the FreeBSD side to be started, but with an interval of something like 18-20 seconds (according to my printf debugging on this side).
5. At the same time the TCP keep-alive timer is also started for the same connection (it's enabled for all connections) with a timeout of 15 seconds.
6. Nothing happens on this connection for the next 15 seconds. I'm not sure why the Linux stack didn't send any "wake-up" ACK packets or something similar, but the tcpdump from the client side shows complete silence between the 14th and the 29th second.
7. Next, the FreeBSD keep-alive logic kicks in and sends an ACK packet, which is ACKed by the Linux stack immediately. However, this ACK packet received by the FreeBSD stack restarts the retransmit timer, and with an interval which is bigger than the keep-alive interval.
8. Points 6 and 7 repeat one more time before the Apache Bench client gives up on this connection and declares it timed out. My understanding is that the connection can "loop" in 6-7 for a very long time and a packet with data will never be retransmitted.
9. As far as I could debug the situation from the FreeBSD side, the restart of the retransmit timer happens in the code after the `process_ACK` label, in the else branch here:
```
if (th->th_ack == tp->snd_max) {
	tcp_timer_activate(tp, TT_REXMT, 0);
	needoutput = 1;
} else if (!tcp_timer_active(tp, TT_PERSIST))
	tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur);
```

So, based on the above situation, I have the following questions:

1. Would it be correct if the retransmit timer were not restarted by keep-alive ACK packets?
2. Assuming that the above change won't break anything else, is there a way to detect that an ACK packet acknowledges a previously sent keep-alive packet?

Regards,
Pavel.
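
For illustration, the loop described in points 6-7 can be reproduced with a toy model of the two timers. This is only a sketch under simplifying assumptions, not FreeBSD code: the function name `first_retransmit_time` and its parameters are hypothetical, time advances in whole seconds, the peer stays silent unless probed, every keep-alive probe is answered immediately, and retransmit backoff is ignored.

```c
#include <stdbool.h>

/*
 * Toy model (hypothetical, not taken from the FreeBSD sources).
 * Returns the second at which the retransmit timer first fires,
 * or -1 if it never fires within `duration` seconds.
 */
static int
first_retransmit_time(int rexmt_interval, int keep_interval, int duration,
    bool ack_restarts_rexmt)
{
	int rexmt_deadline = rexmt_interval;	/* both timers armed at t = 0 */
	int keep_deadline = keep_interval;

	for (int now = 1; now <= duration; now++) {
		/* The retransmit timer expired: data would be resent now. */
		if (now >= rexmt_deadline)
			return (now);

		/* The keep-alive timer expired: probe sent, ACK comes back. */
		if (now >= keep_deadline) {
			keep_deadline = now + keep_interval;
			/* Behaviour in question: the ACK re-arms TT_REXMT. */
			if (ack_restarts_rexmt)
				rexmt_deadline = now + rexmt_interval;
		}
	}
	return (-1);
}
```

With the values from the description above, `first_retransmit_time(18, 15, 600, true)` returns -1, because the retransmit deadline is pushed out every 15 seconds and is never reached. With `ack_restarts_rexmt` set to false the same call returns 18. The model says nothing about whether skipping the restart is safe in the real stack; it only shows why a retransmit interval larger than the keep-alive interval can starve the retransmission.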