TCP Success Story (was Re: TCP_RACK, TCP_BBR, and firewalls)

From: Alan Somers <asomers@FreeBSD.org>
Date: Wed, 17 Jul 2024 20:00:31 UTC
On Sat, Jul 13, 2024 at 1:50 AM <tuexen@freebsd.org> wrote:
>
> > On 13. Jul 2024, at 01:43, Alan Somers <asomers@FreeBSD.org> wrote:
> >
> > I've been experimenting with RACK and BBR.  In my environment, they
> > can dramatically improve single-stream TCP performance, which is
> > awesome.  But pf interferes.  I have to disable pf in order for them
> > to work at all.
> >
> > Is this a known limitation?  If not, I will experiment some more to
> > determine exactly what aspect of my pf configuration is responsible.
> > If so, can anybody suggest what changes would have to happen to make
> > the two compatible?
> A problem with same symptoms was already reported and fixed in
> https://reviews.freebsd.org/D43769
>
> Which version are you using?
>
> Best regards
> Michael
> >
> > -Alan

TL;DR: tcp_rack is good, cc_chd is better, and tcp_bbr is best

I want to follow up with the list to post my conclusions.  First,
tuexen@ helped me solve my problem: in FreeBSD 14.0 there is a
three-way incompatibility between (tcp_bbr || tcp_rack) && lro && pf,
fixed by the review linked above.  I can confirm that tcp_bbr works
for me if I either disable LRO, disable pf, or switch to a 14.1
server.
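
For anyone hitting the same symptoms, the first two workarounds look
something like this (a sketch; ix0 is just a placeholder for your
actual interface):

    # Disable LRO on the interface carrying the traffic
    ifconfig ix0 -lro
    # ...or disable pf entirely
    pfctl -d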

Here's the real problem: on multiple production servers, downloading
large files (or ZFS send/recv streams) was slow.  After ruling out
many possible causes, Wireshark revealed that the connection was
suffering about 0.05% packet loss.  I don't know the source of that
packet loss, but I don't believe it was congestion-related.  Combined
with a 54 ms RTT, that's a fatal combination for the throughput of
loss-based congestion control algorithms.  According to the Mathis
formula [1], I could only expect about 1.1 MBps over such a
connection.  That's actually lower than what I saw: with default
settings (cc_cubic), I averaged 5.6 MBps.  Mathis's assumptions are
probably outdated, but that's still pretty close for such a simple
formula that's 27 years old.
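
For reference, here's a back-of-the-envelope check of that limit,
using the simplest form of the formula, BW <= MSS / (RTT * sqrt(p)).
The 1448-byte MSS is my assumption (typical for a 1500-byte MTU with
TCP timestamps); the RTT and loss rate are the measured values above:

    # Mathis limit: throughput is bounded by MSS / (RTT * sqrt(p))
    # mss in bytes, rtt in seconds, p as a fraction
    awk 'BEGIN { mss = 1448; rtt = 0.054; p = 0.0005;
                 printf "%.1f MBps\n", mss / (rtt * sqrt(p)) / 1e6 }'
    # prints: 1.2 MBps

The small difference from the 1.1 MBps figure comes down to whatever
MSS and constant factor the calculator behind [1] assumes.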

So I benchmarked all available congestion control algorithms for
single download streams.  The results are summarized in the table
below; a sketch of how to switch algorithms follows it.

Algo    Packet Loss Rate    Average Throughput
vegas   0.05%               2.0 MBps
newreno 0.05%               3.2 MBps
cubic   0.05%               5.6 MBps
hd      0.05%               8.6 MBps
cdg     0.05%               13.5 MBps
rack    0.04%               14 MBps
htcp    0.05%               15 MBps
dctcp   0.05%               15 MBps
chd     0.05%               17.3 MBps
bbr     0.05%               29.2 MBps

cubic   10%                 159 kBps
chd     10%                 208 kBps
bbr     10%                 5.7 MBps
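
For anyone reproducing this, switching between runs looks roughly
like the following sketch.  The plain congestion control algorithms
are loadable kernel modules, while RACK and BBR are alternate TCP
stacks and are selected separately:

    # Pluggable CC algorithm: load the module, then select it
    kldload cc_chd
    sysctl net.inet.tcp.cc.algorithm=chd
    # RACK/BBR replace the whole TCP stack
    kldload tcp_bbr
    sysctl net.inet.tcp.functions_default=bbr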

RACK seemed to achieve about the same maximum bandwidth as BBR, though
it took a lot longer to get there.  Also, with RACK, Wireshark
reported about 10x as many retransmissions as dropped packets, which
is suspicious.
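
For what it's worth, comparing those two counts from a capture looks
something like this (capture.pcap is a placeholder filename):

    # Segments Wireshark flags as retransmissions
    tshark -r capture.pcap -Y tcp.analysis.retransmission | wc -l
    # Segments flagged as lost before reaching the capture point
    tshark -r capture.pcap -Y tcp.analysis.lost_segment | wc -l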

At one point, something went haywire and packet loss briefly spiked to
the neighborhood of 10%.  I took advantage of the chaos to repeat my
measurements.  As the table shows, all algorithms sucked under those
conditions, but BBR sucked impressively less than the others.

Disclaimer: there was significant run-to-run variation; the presented
results are averages.  I also did not attempt to measure packet loss
exactly for most runs; 0.05% is merely an average from a few selected
runs.  These measurements were taken on a production server running a
real workload, which introduces noise.  Soon I hope to have the
opportunity to repeat the experiment on an idle server in the same
environment.

In conclusion, while we'd like to use BBR, we really can't until we
upgrade to 14.1, which hopefully will be soon.  So in the meantime
we've switched all relevant servers from cubic to chd, and we'll
reevaluate BBR after the upgrade.
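
For completeness, persisting that choice across reboots is something
like this sketch:

    # Load cc_chd at boot
    echo 'cc_chd_load="YES"' >> /boot/loader.conf
    # Select it by default for new connections
    echo 'net.inet.tcp.cc.algorithm=chd' >> /etc/sysctl.conf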

[1]: https://www.slac.stanford.edu/comp/net/wan-mon/thru-vs-loss.html

-Alan