Re: About the behavior of tuning of tcp_reass_maxseg
- In reply to: Zhenlei Huang : "Re: About the behavior of tuning of tcp_reass_maxseg"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Wed, 04 Oct 2023 22:20:28 UTC
The *ratio* sounds good to me. But making it write tunable seems to be an overshot, as `kern.ipc.nmbclusters` is already a write tunable and it already has an event handler. If the *ratio* is write tunable, we may have to add a separate event handler, similar to nmbclusters_change. I searched around XNU, NetBSD, OpenBSD and DragonFlyBSD. None of them have such an admin interface to setup the global limit. On a second thought, I don't think the "auto-tuning" (actually overwriting) by nmbclusters_change event back to "nmbclusters / 16" is desirable when the "net.inet.tcp.reass.maxsegments" tunable value exists. That's why I think we keep the *ratio* read tunable, and update accordingly like this: EVENTHANDLER_REGISTER(nmbclusters_change, tcp_reass_zone_change, NULL, EVENTHANDLER_PRI_ANY); ... /* Initialize *update* TCP reassembly queue */ static void tcp_reass_zone_change(void *tag) { /* Set the zone limit and read back the effective value. */ tcp_reass_maxseg = nmbclusters / 16; tcp_reass_maxseg = nmbclusters * *ratio*; << update here too on nmbclusters_change event tcp_reass_maxseg = uma_zone_set_max(tcp_reass_zone, tcp_reass_maxseg); } Best Regards, Cheng Cui On Tue, Oct 3, 2023 at 9:20 PM Zhenlei Huang <zlei@freebsd.org> wrote: > > > On Oct 4, 2023, at 7:21 AM, Cheng Cui <cc@freebsd.org> wrote: > > Here are my short answers based on my understanding of the two > relevant commits: > commit 12e2e970515339729eccebdc751afb97fa0af6fa > Author: Andre Oppermann <andre@FreeBSD.org> > Date: Tue Feb 24 15:27:41 2004 +0000 > > commit 4f590175b7f3a4efecdd28ffd3306d30724d32c9 > Author: Paul Saab <ps@FreeBSD.org> > Date: Fri Apr 21 09:25:40 2006 +0000 > > >> So comes some questions. >> >> 1. Is the above behavior expected ? >> > Yes > > > Then is it better to convert `net.inet.tcp.reass.maxsegments` into a > *ratio* `net.inet.tcp.reass.maxsegments_ratio` ? > > We can have a default 1/16 (of nmbclusters), and it can be write tunable. > Also we can limit it to maximum of 15/16 (of nmbclusters) > so that we can always have free mbufs for connections. > > The auto-tuning can still be applied whenever nmbclusters changes. > > > 2. Is it desirable to make `net.inet.tcp.reass.maxsegments` write tunable ? >> > No, as it is indirectly tuned by nmbclusters. If it is writable, the > tcp_reass_zone has to be re-limited, which does not seem to be necessary. > > 3. Is it valid to set 0 to `net.inet.tcp.reass.maxsegments` ? IIUC that >> will effectually disable TCP reassembly. >> > Yes, if you mean set 0 to the tunable `net.inet.tcp.reass.maxsegments` in > /boot/loader.conf. I tested it with 0, and it looks like the system booted > well. But disabling it is not recommended as it conflicts with the protocol > principle. > > 4. Then can we introduce a constant `-1` to explicitly enable the 'auto >> tuning' feature ? > > I don't think it is necessary. > > Best Regards, > Cheng Cui > > > On Thu, Sep 28, 2023 at 12:19 AM Zhenlei Huang <zlei@freebsd.org> wrote: > >> Hi, >> >> I'm recently working on SYSCTLs features. While testing the sysctl >> knob `net.inet.tcp.reass.maxsegments` I got something weird. When >> I adjust `kern.ipc.nmbclusters`, the `net.inet.tcp.reass.maxsegments` >> got 'auto tuned' regardless what I have set (the kernel env) in >> `/boot/loader.conf`. >> >> The relevant code: >> >> ``` >> static int tcp_reass_maxseg = 0; >> SYSCTL_INT(_net_inet_tcp_reass, OID_AUTO, maxsegments, CTLFLAG_RDTUN, >> &tcp_reass_maxseg, 0, >> "Global maximum number of TCP Segments in Reassembly Queue"); >> >> tcp_reass_global_init(void) >> { >> >> tcp_reass_maxseg = nmbclusters / 16; >> TUNABLE_INT_FETCH("net.inet.tcp.reass.maxsegments", >> &tcp_reass_maxseg); >> tcp_reass_zone = uma_zcreate("tcpreass", sizeof (struct >> tseg_qent), >> NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); >> /* Set the zone limit and read back the effective value. */ >> tcp_reass_maxseg = uma_zone_set_max(tcp_reass_zone, >> tcp_reass_maxseg); >> ... >> >> EVENTHANDLER_REGISTER(nmbclusters_change, >> tcp_reass_zone_change, NULL, EVENTHANDLER_PRI_ANY); >> >> } >> >> /* Initialize TCP reassembly queue */ >> static void >> tcp_reass_zone_change(void *tag) >> { >> >> /* Set the zone limit and read back the effective value. */ >> tcp_reass_maxseg = nmbclusters / 16; >> tcp_reass_maxseg = uma_zone_set_max(tcp_reass_zone, >> tcp_reass_maxseg); >> } >> ``` >> >> The above code shows clearly the current behavior. >> >> What I expected is, kernel env has priority to the 'auto tuning', unless >> the admin requested. >> >> So comes some questions. >> >> 1. Is the above behavior expected ? >> 2. Is it desirable to make `net.inet.tcp.reass.maxsegments` write tunable >> ? >> 3. Is it valid to set 0 to `net.inet.tcp.reass.maxsegments` ? IIUC that >> will effectually disable TCP reassembly. >> 4. Then can we introduce a constant `-1` to explicitly enable the 'auto >> tuning' feature ? >> >> Best regards, >> Zhenlei >> >> >> > >