Re: git: 0d7445193abc - main - tcp: remove tcptw, the compressed timewait state structure
Date: Fri, 07 Oct 2022 02:46:56 UTC
Hi On Fri, Oct 07, 2022 at 02:35:38AM +0000, Gleb Smirnoff wrote: T> The memory savings the tcptw brought back in 2003 (see 340c35de6a2) no T> longer justify the complexity required to maintain it. For longer T> explanation please check out the email [1]. T> T> Surpisingly through almost 20 years the TCP stack functionality of T> handling the TIME_WAIT state with a normal tcpcb did not bitrot. The T> existing tcp_input() properly handles a tcpcb in TCPS_TIME_WAIT state, T> which is confirmed by the packetdrill tcp-testsuite [2]. T> T> This change just removes tcptw and leaves INP_TIMEWAIT. The flag will T> be removed in a separate commit. This makes it easier to review and T> possibly debug the changes. T> T> [1] https://lists.freebsd.org/archives/freebsd-net/2022-January/001206.html T> [2] https://github.com/freebsd-net/tcp-testsuite T> T> Differential revision: https://reviews.freebsd.org/D36398 The memory savings cut here are more than just sizeof(struct tcpcb) - sizeof(struct tcptw). We also keep the socket around for duration of TIME_WAIT. And the stuff that may hang off the socket, e.g. kTLS context. I prepared a changeset that would free the socket when transitioning to TIME_WAIT: https://reviews.freebsd.org/D36887 However, the patch adds back some complexity that we just removed. It also introduces a problem with two recent features of pluggable TCP stacks and massive setsockopt: we got sysctl_setsockopt() that would run setsockopt on a pcb matching certain criteria. One of most common applications of this tool is to switch from one TCP stack to the other. To execute sosetopt() the sysctl_setsockopt() needs socket. With the compressed timewait structure, we didn't have it. Now we don't, too. However, the compressed timewaits were short-circuited to the default TCP stack and didn't depend on the pluggable ones. With a regular tcpcb we still reference the stack. That makes it impossible to switch off ALL existing connections from some stack with sysctl_setsockopt(). After todays discussion with mtuexen@, rscheff@ and peterlei@ we decided not to rush with checking in D36887 and have more discussion whether the memory savings worth the comlexity or not. But keep the revision around. -- Gleb Smirnoff