From: Alan Somers <asomers@gmail.com>
Date: Thu, 18 Jul 2024 12:37:45 -0600
Subject: Re: TCP Success Story (was Re: TCP_RACK, TCP_BBR, and firewalls)
To: tuexen@freebsd.org
Cc: Junho Choi, FreeBSD Net
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

Coexist how? Do you mean that one socket can use one and a different
socket uses the other? That makes sense.
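
If so, I'd expect the per-socket selection to look roughly like the
untested sketch below. It assumes both tcp_rack.ko and tcp_bbr.ko are
already loaded, that the stacks register under the names "rack" and
"bbr", and that the option is set before connecting; the names and the
overall shape are my assumptions, not something I've verified:

/*
 * Untested sketch: pick a TCP stack per socket with TCP_FUNCTION_BLK.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <err.h>

static void
use_stack(int s, const char *name)
{
        struct tcp_function_set tfs;

        memset(&tfs, 0, sizeof(tfs));
        strlcpy(tfs.function_set_name, name, sizeof(tfs.function_set_name));
        if (setsockopt(s, IPPROTO_TCP, TCP_FUNCTION_BLK, &tfs,
            sizeof(tfs)) == -1)
                err(1, "TCP_FUNCTION_BLK %s", name);
}

int
main(void)
{
        int s1 = socket(AF_INET, SOCK_STREAM, 0);
        int s2 = socket(AF_INET, SOCK_STREAM, 0);

        if (s1 == -1 || s2 == -1)
                err(1, "socket");
        use_stack(s1, "rack");  /* this connection would use RACK */
        use_stack(s2, "bbr");   /* ...and this one BBR */
        return (0);
}

The system-wide default would presumably stay whatever
net.inet.tcp.functions_default points at.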

On Thu, Jul 18, 2024 at 10:34 AM wrote:
>
> > On 18. Jul 2024, at 15:00, Junho Choi wrote:
> >
> > Alan - this is a great result to see. Thanks for experimenting.
> >
> > Just curious why bbr and rack don't co-exist? Those are two separate things.
> > Is it a current bug or by design?
> Technically RACK and BBR can coexist. The problem was with pf and/or LRO.
>
> But this is all fixed now in 14.1 and head.
>
> Best regards
> Michael
> >
> > BR,
> > On Thu, Jul 18, 2024 at 5:27 AM wrote:
> >> On 17. Jul 2024, at 22:00, Alan Somers wrote:
> >>
> >> On Sat, Jul 13, 2024 at 1:50 AM wrote:
> >>>
> >>>> On 13. Jul 2024, at 01:43, Alan Somers wrote:
> >>>>
> >>>> I've been experimenting with RACK and BBR. In my environment, they
> >>>> can dramatically improve single-stream TCP performance, which is
> >>>> awesome. But pf interferes. I have to disable pf in order for them
> >>>> to work at all.
> >>>>
> >>>> Is this a known limitation? If not, I will experiment some more to
> >>>> determine exactly what aspect of my pf configuration is responsible.
> >>>> If so, can anybody suggest what changes would have to happen to make
> >>>> the two compatible?
> >>> A problem with the same symptoms was already reported and fixed in
> >>> https://reviews.freebsd.org/D43769
> >>>
> >>> Which version are you using?
> >>>
> >>> Best regards
> >>> Michael
> >>>>
> >>>> -Alan
> >>
> >> TLDR; tcp_rack is good, cc_chd is better, and tcp_bbr is best
> >>
> >> I want to follow up with the list to post my conclusions. Firstly
> >> tuexen@ helped me solve my problem: in FreeBSD 14.0 there is a 3-way
> >> incompatibility between (tcp_bbr || tcp_rack) && lro && pf. I can
> >> confirm that tcp_bbr works for me if I either disable LRO, disable PF,
> >> or switch to a 14.1 server.
> >>
> >> Here's the real problem: on multiple production servers, downloading
> >> large files (or ZFS send/recv streams) was slow. After ruling out
> >> many possible causes, wireshark revealed that the connection was
> >> suffering about 0.05% packet loss. I don't know the source of that
> >> packet loss, but I don't believe it to be congestion-related. Along
> >> with a 54ms RTT, that's a fatal combination for the throughput of
> >> loss-based congestion control algorithms. According to the Mathis
> >> Formula [1], I could only expect 1.1 MBps over such a connection.
> >> That's actually worse than what I saw. With default settings
> >> (cc_cubic), I averaged 5.6 MBps. Probably Mathis's assumptions are
> >> outdated, but that's still pretty close for such a simple formula
> >> that's 27 years old.
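
As a sanity check on the Mathis number quoted above, here's a
back-of-the-envelope sketch. The 1460-byte MSS and the simplified
constant of 1 are my assumptions, not figures from this thread; the RTT
and loss rate are the measured values:

/*
 * Mathis estimate: rate ~= (MSS / RTT) * (1 / sqrt(p)).
 * Compile with -lm.
 */
#include <math.h>
#include <stdio.h>

int
main(void)
{
        const double mss = 1460.0;      /* bytes, assumed */
        const double rtt = 0.054;       /* 54 ms round-trip time */
        const double p = 0.0005;        /* 0.05% packet loss */
        double rate = (mss / rtt) / sqrt(p);    /* bytes per second */

        printf("expected throughput: %.1f MBps\n", rate / 1e6);
        return (0);
}

It prints about 1.2 MBps, in the same ballpark as the 1.1 MBps figure
above; the exact number depends on the MSS and constant you plug in.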
> >>
> >> So I benchmarked all available congestion control algorithms for
> >> single download streams. The results are summarized in the table
> >> below.
> >>
> >> Algo      Packet Loss Rate   Average Throughput
> >> vegas     0.05%              2.0 MBps
> >> newreno   0.05%              3.2 MBps
> >> cubic     0.05%              5.6 MBps
> >> hd        0.05%              8.6 MBps
> >> cdg       0.05%              13.5 MBps
> >> rack      0.04%              14 MBps
> >> htcp      0.05%              15 MBps
> >> dctcp     0.05%              15 MBps
> >> chd       0.05%              17.3 MBps
> >> bbr       0.05%              29.2 MBps
> >> cubic     10%                159 kBps
> >> chd       10%                208 kBps
> >> bbr       10%                5.7 MBps
> >>
> >> RACK seemed to achieve about the same maximum bandwidth as BBR, though
> >> it took a lot longer to get there. Also, with RACK, wireshark
> >> reported about 10x as many retransmissions as dropped packets, which
> >> is suspicious.
> >>
> >> At one point, something went haywire and packet loss briefly spiked to
> >> the neighborhood of 10%. I took advantage of the chaos to repeat my
> >> measurements. As the table shows, all algorithms sucked under those
> >> conditions, but BBR sucked impressively less than the others.
> >>
> >> Disclaimer: there was significant run-to-run variation; the presented
> >> results are averages. And I did not attempt to measure packet loss
> >> exactly for most runs; 0.05% is merely an average of a few selected
> >> runs. These measurements were taken on a production server running a
> >> real workload, which introduces noise. Soon I hope to have the
> >> opportunity to repeat the experiment on an idle server in the same
> >> environment.
> >>
> >> In conclusion, while we'd like to use BBR, we really can't until we
> >> upgrade to 14.1, which hopefully will be soon. So in the meantime
> >> we've switched all relevant servers from cubic to chd, and we'll
> >> reevaluate BBR after the upgrade.
> > Hi Alan,
> >
> > just to be clear: the version of BBR currently implemented is
> > BBR version 1, which is known to be unfair in certain scenarios.
> > Google is still working on BBR to address this problem and improve
> > it in other aspects. But there is no RFC yet and the updates haven't
> > been implemented yet in FreeBSD.
> >
> > Best regards
> > Michael
> >>
> >> [1]: https://www.slac.stanford.edu/comp/net/wan-mon/thru-vs-loss.html
> >>
> >> -Alan
> >
> > --
> > Junho Choi | https://saturnsoft.net
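
P.S. Regarding the switch from cubic to chd mentioned above: if anyone
wants to pick a congestion control algorithm per socket rather than
system-wide, an untested sketch using the TCP_CONGESTION socket option
is below. It assumes cc_chd.ko is loaded and that the algorithm
registers under the name "chd":

/*
 * Untested sketch: select the chd congestion control algorithm for a
 * single socket via TCP_CONGESTION.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <err.h>

int
main(void)
{
        const char algo[] = "chd";
        int s = socket(AF_INET, SOCK_STREAM, 0);

        if (s == -1)
                err(1, "socket");
        if (setsockopt(s, IPPROTO_TCP, TCP_CONGESTION, algo,
            sizeof(algo) - 1) == -1)
                err(1, "TCP_CONGESTION %s", algo);
        return (0);
}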