From: Michael Tuexen
To: Pavel Vazharov
Cc: freebsd-net@freebsd.org
Subject: Re: Interaction between the re-transmit and keep-alive logic.
Date: Mon, 9 Dec 2024 17:56:11 +0100
Message-Id: <8A8F8BFA-B407-4F76-AE93-AD679961BCC4@lurchi.franken.de>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

> On 9.
> Dec 2024, at 10:50, Pavel Vazharov wrote:
>
> Hi there,
>
> We are using the network stack of FreeBSD 13 on top of DPDK in our application.
> During the last tests in the lab I stumbled upon the following situation:
> 1. It's a test where 5000 parallel connections are opened by Apache
> Bench and each one downloads 1 MB of data. It causes the client NIC to
> start dropping packets due to overflows, which is intentional behavior.
> 2. The server side is our application with the FreeBSD stack. The
> client side is Ubuntu 24.04 with Linux 6.8.0.
> 3. So, a connection is opened and the download starts on it. At some
> point the first drops occur and, according to the tcpdump from the
> client side, they take a few seconds before the connection heals up.
> However, these drops lead to increased values of t_srtt and t_rttvar and
> thus to an increased value of t_rxtcur.

Do you observe increased values of t_rxtcur due to exponential backoff
or due to extreme values of t_srtt and t_rttvar?

>
> 4. The window opens again up to 100-200 KB with lots of packets
> in flight and the drops start again. They cause the re-transmit timer
> on the FreeBSD side to be started, but with an interval of something
> like 18-20 seconds (according to my printf debugging on this side).
> 5. At the same time the TCP keep-alive timer is also started for the
> same connection (it's enabled for all connections) with a timeout of
> 15 seconds.
> 6. Nothing happens on this connection for the next 15 seconds. I'm not
> sure why the Linux stack didn't send any "wake-up" ACK packets or
> something, but the tcpdump from the client side shows full silence
> between the 14th and 29th second.
> 7. Next, the FreeBSD keep-alive logic kicks in and sends an ACK packet,
> which is ACKed by the Linux stack immediately. However, this ACK
> packet received by the FreeBSD stack leads to a restart of the
> retransmit timer, and with an interval which is bigger than the
> keep-alive interval.
> 8.
> Points 6 and 7 repeat one more time before the Apache Bench client
> gives up on this connection and declares that it has timed out. My
> understanding is that the connection can "loop" in 6-7 for a very long
> time and a packet with data will never be retransmitted.

Can you provide a .pcap file?

Best regards
Michael

> 9. As far as I debugged the situation from the FreeBSD side, the
> restart of the retransmit timer happens in the code after the
> `process_ACK` label, in the else branch here:
> ```
> if (th->th_ack == tp->snd_max) {
>         tcp_timer_activate(tp, TT_REXMT, 0);
>         needoutput = 1;
> } else if (!tcp_timer_active(tp, TT_PERSIST))
>         tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur);
> ```
>
> So, based on the above situation, I have the following questions:
> 1. Would it be correct if the re-transmit timer is not restarted by
> keep-alive ACK packets?
> 2. Assuming that the above change won't break anything else, is there
> a way of detecting that an ACK packet acknowledges a previously sent
> keep-alive packet?
>
> Regards,
> Pavel.