From nobody Wed Jul 20 13:10:41 2022
Date: Wed, 20 Jul 2022 13:10:41 GMT
Message-Id: <202207201310.26KDAfBS039647@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Mike Karels
Subject: git: a795c6e93444 - main - tcp.4: Sort sysctl variables
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
X-BeenThere: dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: karels
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: a795c6e9344406e055d38bd94eac40e8f815a0ef
Auto-Submitted: auto-generated
ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1658322642; a=rsa-sha256; cv=none; b=TzYXk/TaYO7THkZDJuFO9GTqh4Oe9qCXz6yLSCaE4LSvANrnfKY6NjnkI26Abh1O3OwsEj WyTn3xh5+VLhr4LPCEEGEDGEKXimk/GkulrWdP3skY/eFmyKBU/hxHPYGJzpPwnWhtblfs
RDVPExz2x4/LfJwOqMVOEzeOHAUjnfXJ9uISUXe0ZhiQHEJ5v1ViNUriPDMkJ8BE4+RX8z ZDPKoJFFCy7znRpHb6UiN7dX3Qfpowjz8RX+1ZS92VqXZIKoWNPGKAKx7/93ctFG+cW8iV 4KE01TCHnOZ+vdzSlkk2S3F94NKXW+3sFFPq0sUTjSkCN/PseetOTDbEM9Bsgg==
ARC-Authentication-Results: i=1; mx1.freebsd.org; none
X-ThisMailContainsUnwantedMimeParts: N

The branch main has been updated by karels:

URL: https://cgit.FreeBSD.org/src/commit/?id=a795c6e9344406e055d38bd94eac40e8f815a0ef

commit a795c6e9344406e055d38bd94eac40e8f815a0ef
Author:     Mike Karels
AuthorDate: 2022-07-18 16:39:03 +0000
Commit:     Mike Karels
CommitDate: 2022-07-20 13:09:09 +0000

    tcp.4: Sort sysctl variables

    In preparation for updates including missing variables, sort
    the sysctl variables in the MIB variables section alphabetically.
    Add a new "hostcache" entry for the hostcache node, containing
    the intro text that was previously in hostcache.enable.
    Also cleanups per review comments.

    Reviewed by: transport(tuexen), manpages(bcr)
    Differential Revision: https://reviews.freebsd.org/D35844
    MFC after: 1 week

    (cherry picked from commit 5cf709ce72c0b6eb4b4d57db015a65f8a84166d5)
---
 share/man/man4/tcp.4 | 633 ++++++++++++++++++++++++++-------------------------
 1 file changed, 325 insertions(+), 308 deletions(-)

diff --git a/share/man/man4/tcp.4 b/share/man/man4/tcp.4
index 307750a95c7f..483984112031 100644
--- a/share/man/man4/tcp.4
+++ b/share/man/man4/tcp.4
@@ -34,7 +34,7 @@
 .\" From: @(#)tcp.4 8.1 (Berkeley) 6/5/93
 .\" $FreeBSD$
 .\"
-.Dd July 14, 2022
+.Dd July 20, 2022
 .Dt TCP 4
 .Os
 .Sh NAME
@@ -422,64 +422,6 @@ branch of the
 .Xr sysctl 3
 MIB.
 .Bl -tag -width ".Va v6pmtud_blackhole_mss"
-.It Va rfc1323
-Implement the window scaling and timestamp options of RFC 1323/RFC 7323
-(default is true).
-.It Va tolerate_missing_ts
-Tolerate the missing of timestamps (RFC 1323/RFC 7323) for
-.Tn TCP
-segments belonging to
-.Tn TCP
-connections for which support of
-.Tn TCP
-timestamps has been negotiated.
-As of June 2021, several TCP stacks are known to violate RFC 7323, including -modern widely deployed ones. -Therefore the default is 1, i.e., the missing of timestamps is tolerated. -.It Va mssdflt -The default value used for the maximum segment size -.Pq Dq MSS -when no advice to the contrary is received from MSS negotiation. -.It Va sendspace -Maximum -.Tn TCP -send window. -.It Va recvspace -Maximum -.Tn TCP -receive window. -.It Va log_in_vain -Log any connection attempts to ports where there is not a socket -accepting connections. -The value of 1 limits the logging to -.Tn SYN -(connection establishment) packets only. -That of 2 results in any -.Tn TCP -packets to closed ports being logged. -Any value unlisted above disables the logging -(default is 0, i.e., the logging is disabled). -.It Va msl -The Maximum Segment Lifetime, in milliseconds, for a packet. -.It Va keepinit -Timeout, in milliseconds, for new, non-established -.Tn TCP -connections. -The default is 75000 msec. -.It Va keepidle -Amount of time, in milliseconds, that the connection must be idle -before keepalive probes (if enabled) are sent. -The default is 7200000 msec (2 hours). -.It Va keepintvl -The interval, in milliseconds, between keepalive probes sent to remote -machines, when no response is received on a -.Va keepidle -probe. -The default is 75000 msec. -.It Va keepcnt -Number of probes sent, with no response, before a connection -is dropped. -The default is 8 packets. .It Va always_keepalive Assume that .Dv SO_KEEPALIVE @@ -488,115 +430,15 @@ is set on all connections, the kernel will periodically send a packet to the remote host to verify the connection is still up. -.It Va icmp_may_rst -Certain -.Tn ICMP -unreachable messages may abort connections in -.Tn SYN-SENT -state. -.It Va do_tcpdrain -Flush packets in the -.Tn TCP -reassembly queue if the system is low on mbufs. 
.It Va blackhole If enabled, disable sending of RST when a connection is attempted to a port where there is not a socket accepting connections. See .Xr blackhole 4 . -.It Va delayed_ack -Delay ACK to try and piggyback it onto a data packet. .It Va delacktime Maximum amount of time, in milliseconds, before a delayed ACK is sent. -.It Va path_mtu_discovery -Enable Path MTU Discovery. -.It Va tcbhashsize -Size of the -.Tn TCP -control-block hash table -(read-only). -This may be tuned using the kernel option -.Dv TCBHASHSIZE -or by setting -.Va net.inet.tcp.tcbhashsize -in the -.Xr loader 8 . -.It Va pcbcount -Number of active process control blocks -(read-only). -.It Va syncookies -Determines whether or not -.Tn SYN -cookies should be generated for outbound -.Tn SYN-ACK -packets. -.Tn SYN -cookies are a great help during -.Tn SYN -flood attacks, and are enabled by default. -(See -.Xr syncookies 4 . ) -.It Va isn_reseed_interval -The interval (in seconds) specifying how often the secret data used in -RFC 1948 initial sequence number calculations should be reseeded. -By default, this variable is set to zero, indicating that -no reseeding will occur. -Reseeding should not be necessary, and will break -.Dv TIME_WAIT -recycling for a few minutes. -.It Va reass.cursegments -The current total number of segments present in all reassembly queues. -.It Va reass.maxsegments -The maximum limit on the total number of segments across all reassembly -queues. -The limit can be adjusted as a tunable. -.It Va reass.maxqueuelen -The maximum number of segments allowed in each reassembly queue. -By default, the system chooses a limit based on each TCP connection's -receive buffer size and maximum segment size (MSS). -The actual limit applied to a session's reassembly queue will be the lower of -the system-calculated automatic limit and the user-specified -.Va reass.maxqueuelen -limit. -.It Va rexmit_initial , rexmit_min , rexmit_slop -Adjust the retransmit timer calculation for -.Tn TCP . 
-The slop is -typically added to the raw calculation to take into account -occasional variances that the -.Tn SRTT -(smoothed round-trip time) -is unable to accommodate, while the minimum specifies an -absolute minimum. -While a number of -.Tn TCP -RFCs suggest a 1 -second minimum, these RFCs tend to focus on streaming behavior, -and fail to deal with the fact that a 1 second minimum has severe -detrimental effects over lossy interactive connections, such -as a 802.11b wireless link, and over very fast but lossy -connections for those cases not covered by the fast retransmit -code. -For this reason, we use 200ms of slop and a near-0 -minimum, which gives us an effective minimum of 200ms (similar to -.Tn Linux ) . -The initial value is used before an RTT measurement has been performed. -.It Va initcwnd_segments -Enable the ability to specify initial congestion window in number of segments. -The default value is 10 as suggested by RFC 6928. -Changing the value on fly would not affect connections using congestion window -from the hostcache. -Caution: -This regulates the burst of packets allowed to be sent in the first RTT. -The value should be relative to the link capacity. -Start with small values for lower-capacity links. -Large bursts can cause buffer overruns and packet drops if routers have small -buffers or the link is experiencing congestion. -.It Va newcwd -Enable the New Congestion Window Validation mechanism as described in RFC 7661. -This gently reduces the congestion window during periods, where TCP is -application limited and the network bandwidth is not utilized completely. -That prevents self-inflicted packet losses once the application starts to -transmit data at a higher speed. +.It Va delayed_ack +Delay ACK to try and piggyback it onto a data packet. .It Va do_lrd Enable Lost Retransmission Detection for SACK-enabled sessions, disabled by default. @@ -617,76 +459,10 @@ mode, sending only one new packet for each ACK received. 
Helpful when a misconfigured token bucket traffic policer causes persistent high losses leading to RTO, but reduces PRR effectiveness in more common settings (default is false). -.It Va rfc6675_pipe -Deprecated and superseded by -.Va sack.revised -.It Va rfc3042 -Enable the Limited Transmit algorithm as described in RFC 3042. -It helps avoid timeouts on lossy links and also when the congestion window -is small, as happens on short transfers. -.It Va rfc3390 -Enable support for RFC 3390, which allows for a variable-sized -starting congestion window on new connections, depending on the -maximum segment size. -This helps throughput in general, but -particularly affects short transfers and high-bandwidth large -propagation-delay connections. -.It Va sack.enable -Enable support for RFC 2018, TCP Selective Acknowledgment option, -which allows the receiver to inform the sender about all successfully -arrived segments, allowing the sender to retransmit the missing segments -only. -.It Va sack.revised -Enables three updated mechanisms from RFC6675 (default is true). -Calculate the bytes in flight using the algorithm described in RFC 6675, and -is also an improvement when Proportional Rate Reduction is enabled. -Next, Rescue Retransmission helps timely loss recovery, when the trailing segments -of a transmission are lost, while no additional data is ready to be sent. -In case a partial ACK without a SACK block is received during SACK loss -recovery, the trailing segment is immediately resent, rather than waiting -for a Retransmission timeout. -Finally, SACK loss recovery is also engaged, once two segments plus one byte are -SACKed - even if no traditional duplicate ACKs were observed. -.It Va sack.maxholes -Maximum number of SACK holes per connection. -Defaults to 128. -.It Va sack.globalmaxholes -Maximum number of SACK holes per system, across all connections. -Defaults to 65536. 
-.It Va maxtcptw -When a TCP connection enters the -.Dv TIME_WAIT -state, its associated socket structure is freed, since it is of -negligible size and use, and a new structure is allocated to contain a -minimal amount of information necessary for sustaining a connection in -this state, called the compressed TCP TIME_WAIT state. -Since this structure is smaller than a socket structure, it can save -a significant amount of system memory. -The -.Va net.inet.tcp.maxtcptw -MIB variable controls the maximum number of these structures allocated. -By default, it is initialized to -.Va kern.ipc.maxsockets -/ 5. -.It Va nolocaltimewait -Suppress creating of compressed TCP TIME_WAIT states for connections in -which both endpoints are local. -.It Va fast_finwait2_recycle -Recycle -.Tn TCP -.Dv FIN_WAIT_2 -connections faster when the socket is marked as -.Dv SBS_CANTRCVMORE -(no user process has the socket open, data received on -the socket cannot be read). -The timeout used here is -.Va finwait2_timeout . -.It Va finwait2_timeout -Timeout to use for fast recycling of +.It Va do_tcpdrain +Flush packets in the .Tn TCP -.Dv FIN_WAIT_2 -connections. -Defaults to 60 seconds. +reassembly queue if the system is low on mbufs. .It Va ecn.enable Enable support for TCP Explicit Congestion Notification (ECN). ECN allows a TCP sender to reduce the transmission rate in order to @@ -707,40 +483,20 @@ Number of retries (SYN or SYN/ACK retransmits) before disabling ECN on a specific connection. This is needed to help with connection establishment when a broken firewall is in the network path. -.It Va pmtud_blackhole_detection -Enable automatic path MTU blackhole detection. -In case of retransmits of MSS sized segments, -the OS will lower the MSS to check if it's an MTU problem. 
-If the current MSS is greater than the configured value to try -.Po Va net.inet.tcp.pmtud_blackhole_mss -and -.Va net.inet.tcp.v6pmtud_blackhole_mss -.Pc , -it will be set to this value, otherwise, -the MSS will be set to the default values -.Po Va net.inet.tcp.mssdflt -and -.Va net.inet.tcp.v6mssdflt -.Pc . -Settings: -.Bl -tag -compact -.It 0 -Disable path MTU blackhole detection. -.It 1 -Enable path MTU blackhole detection for IPv4 and IPv6. -.It 2 -Enable path MTU blackhole detection only for IPv4. -.It 3 -Enable path MTU blackhole detection only for IPv6. -.El -.It Va pmtud_blackhole_mss -MSS to try for IPv4 if PMTU blackhole detection is turned on. -.It Va v6pmtud_blackhole_mss -MSS to try for IPv6 if PMTU blackhole detection is turned on. -.It Va fastopen.acceptany -When non-zero, all client-supplied TFO cookies will be considered to be valid. -The default is 0. -.It Va fastopen.autokey +.It Va fast_finwait2_recycle +Recycle +.Tn TCP +.Dv FIN_WAIT_2 +connections faster when the socket is marked as +.Dv SBS_CANTRCVMORE +(no user process has the socket open, data received on +the socket cannot be read). +The timeout used here is +.Va finwait2_timeout . +.It Va fastopen.acceptany +When non-zero, all client-supplied TFO cookies will be considered to be valid. +The default is 0. +.It Va fastopen.autokey When this and .Va net.inet.tcp.fastopen.server_enable are non-zero, a new key will be automatically generated after this specified @@ -823,75 +579,182 @@ bytes to this sysctl. Install a new pre-shared key by writing .Va net.inet.tcp.fastopen.keylen bytes to this sysctl. -.It Va hostcache.enable +.It Va finwait2_timeout +Timeout to use for fast recycling of +.Tn TCP +.Dv FIN_WAIT_2 +connections +.Pq Va fast_finwait2_recycle . +Defaults to 60 seconds. +.It Va functions_available +List of available TCP function blocks (TCP stacks). +.It Va functions_default +The default TCP function block (TCP stack). 
+.It Va functions_inherit_listen_socket_stack +Determines whether to inherit listen socket's TCP stack or use the current +system default TCP stack, as defined by +.Va functions_default . +Default is true. +.It Va hostcache The TCP host cache is used to cache connection details and metrics to improve future performance of connections between the same hosts. At the completion of a TCP connection, a host will cache information for the connection for some defined period of time. +There are a number of +.Va hostcache +variables under this node. +See +.Va hostcache.enable . +.It Va hostcache.bucketlimit +The maximum number of entries for the same hash. +Defaults to 30. +.It Va hostcache.cachelimit +Overall entry limit for hostcache. +Defaults to +.Va hashsize +* +.Va bucketlimit . +.It Va hostcache.count +The current number of entries in the host cache. +.It Va hostcache.enable +Enable/disable the host cache: .Bl -tag -compact .It 0 Disable the host cache. .It 1 Enable the host cache. (default) .El -.It Va hostcache.purgenow -Immediately purge all entries once set to any value. -Setting this to 2 will also reseed the hash salt. -.It Va hostcache.purge -Expire all entires on next pruning of host cache entries. -Any non-zero setting will be reset to zero, once the pruge -is running. -.Bl -tag -compact -.It 0 -Do not purge all entries when pruning the host cache. (default) -.It 1 -Purge all entries when doing the next pruning. -.It 2 -Purge all entries, and also reseed the hash salt. -.El -.It Va hostcache.prune -Time in seconds between pruning expired host cache entries. -Defaults to 300 (5 minutes). .It Va hostcache.expire Time in seconds, how long a entry should be kept in the host cache since last accessed. Defaults to 3600 (1 hour). -.It Va hostcache.count -The current number of entries in the host cache. -.It Va hostcache.bucketlimit -The maximum number of entries for the same hash. -Defaults to 30. .It Va hostcache.hashsize Size of TCP hostcache hashtable. 
This number has to be a power of two, or will be rejected. Defaults to 512. -.It Va hostcache.cachelimit -Overall entry limit for hostcache. -Defaults to hashsize * bucketlimit. .It Va hostcache.histo Provide a Histogram of the hostcache hash utilization. .It Va hostcache.list Provide a complete list of all current entries in the host cache. -.It Va functions_available -List of available TCP function blocks (TCP stacks). -.It Va functions_default -The default TCP function block (TCP stack). -.It Va functions_inherit_listen_socket_stack -Determines whether to inherit listen socket's tcp stack or use the current -system default tcp stack, as defined by -.Va functions_default . -Default is true. +.It Va hostcache.prune +Time in seconds between pruning expired host cache entries. +Defaults to 300 (5 minutes). +.It Va hostcache.purge +Expire all entires on next pruning of host cache entries. +Any non-zero setting will be reset to zero, once the purge +is running. +.Bl -tag -compact +.It 0 +Do not purge all entries when pruning the host cache (default). +.It 1 +Purge all entries when doing the next pruning. +.It 2 +Purge all entries and also reseed the hash salt. +.El +.It Va hostcache.purgenow +Immediately purge all entries once set to any value. +Setting this to 2 will also reseed the hash salt. +.It Va icmp_may_rst +Certain +.Tn ICMP +unreachable messages may abort connections in +.Tn SYN-SENT +state. +.It Va initcwnd_segments +Enable the ability to specify initial congestion window in number of segments. +The default value is 10 as suggested by RFC 6928. +Changing the value on the fly would not affect connections +using congestion window from the hostcache. +Caution: +This regulates the burst of packets allowed to be sent in the first RTT. +The value should be relative to the link capacity. +Start with small values for lower-capacity links. +Large bursts can cause buffer overruns and packet drops if routers have small +buffers or the link is experiencing congestion. 
.It Va insecure_rst Use criteria defined in RFC793 instead of RFC5961 for accepting RST segments. Default is false. .It Va insecure_syn Use criteria defined in RFC793 instead of RFC5961 for accepting SYN segments. Default is false. -.It Va ts_offset_per_conn -When initializing the TCP timestamps, use a per connection offset instead of a -per host pair offset. -Default is to use per connection offsets as recommended in RFC 7323. +.It Va isn_reseed_interval +The interval (in seconds) specifying how often the secret data used in +RFC 1948 initial sequence number calculations should be reseeded. +By default, this variable is set to zero, indicating that +no reseeding will occur. +Reseeding should not be necessary, and will break +.Dv TIME_WAIT +recycling for a few minutes. +.It Va keepcnt +Number of keepalive probes sent, with no response, before a connection +is dropped. +The default is 8 packets. +.It Va keepidle +Amount of time, in milliseconds, that the connection must be idle +before sending keepalive probes (if enabled). +The default is 7200000 msec (7.2M msec, 2 hours). +.It Va keepinit +Timeout, in milliseconds, for new, non-established +.Tn TCP +connections. +The default is 75000 msec (75K msec, 75 sec). +.It Va keepintvl +The interval, in milliseconds, between keepalive probes sent to remote +machines, when no response is received on a +.Va keepidle +probe. +The default is 75000 msec (75K msec, 75 sec). +.It Va log_in_vain +Log any connection attempts to ports where there is not a socket +accepting connections. +The value of 1 limits the logging to +.Tn SYN +(connection establishment) packets only. +A value of 2 results in any +.Tn TCP +packets to closed ports being logged. +Any value not listed above disables the logging +(default is 0, i.e., the logging is disabled). 
+.It Va maxtcptw +When a TCP connection enters the +.Dv TIME_WAIT +state, its associated socket structure is freed, since it is of +negligible size and use, and a new structure is allocated to contain a +minimal amount of information necessary for sustaining a connection in +this state, called the compressed TCP +.Dv TIME_WAIT +state. +Since this structure is smaller than a socket structure, it can save +a significant amount of system memory. +The +.Va net.inet.tcp.maxtcptw +MIB variable controls the maximum number of these structures allocated. +By default, it is initialized to +.Va kern.ipc.maxsockets +/ 5. +.It Va msl +The Maximum Segment Lifetime, in milliseconds, for a packet. +.It Va mssdflt +The default value used for the maximum segment size +.Pq Dq MSS +when no advice to the contrary is received from MSS negotiation. +.It Va newcwd +Enable the New Congestion Window Validation mechanism as described in RFC 7661. +This gently reduces the congestion window during periods, where TCP is +application limited and the network bandwidth is not utilized completely. +That prevents self-inflicted packet losses once the application starts to +transmit data at a higher speed. +.It Va nolocaltimewait +Suppress creation of compressed TCP +.Dv TIME_WAIT +states for connections in +which both endpoints are local. +.It Va path_mtu_discovery +Enable Path MTU Discovery. +.It Va pcbcount +Number of active process control blocks +(read-only). .It Va perconn_stats_enable Controls the default collection of statistics for all connections using the .Xr stats 3 @@ -903,16 +766,170 @@ A CSV list of template_spec=percent key-value pairs which controls the per template sampling rates when .Xr stats 3 sampling is enabled. -.It Va udp_tunneling_port -The local UDP encapsulation port. -A value of 0 indicates that UDP encapsulation is disabled. -The default is 0. +.It Va pmtud_blackhole_detection +Enable automatic path MTU blackhole detection. 
+In case of retransmits of MSS sized segments, +the OS will lower the MSS to check if it's an MTU problem. +If the current MSS is greater than the configured value to try +.Po Va net.inet.tcp.pmtud_blackhole_mss +and +.Va net.inet.tcp.v6pmtud_blackhole_mss +.Pc , +it will be set to this value, otherwise, +the MSS will be set to the default values +.Po Va net.inet.tcp.mssdflt +and +.Va net.inet.tcp.v6mssdflt +.Pc . +Settings: +.Bl -tag -compact +.It 0 +Disable path MTU blackhole detection. +.It 1 +Enable path MTU blackhole detection for IPv4 and IPv6. +.It 2 +Enable path MTU blackhole detection only for IPv4. +.It 3 +Enable path MTU blackhole detection only for IPv6. +.El +.It Va pmtud_blackhole_mss +MSS to try for IPv4 if PMTU blackhole detection is turned on. +.It Va reass.cursegments +The current total number of segments present in all reassembly queues. +.It Va reass.maxqueuelen +The maximum number of segments allowed in each reassembly queue. +By default, the system chooses a limit based on each TCP connection's +receive buffer size and maximum segment size (MSS). +The actual limit applied to a session's reassembly queue will be the lower of +the system-calculated automatic limit and the user-specified +.Va reass.maxqueuelen +limit. +.It Va reass.maxsegments +The maximum limit on the total number of segments across all reassembly +queues. +The limit can be adjusted as a tunable. +.It Va recvspace +Maximum +.Tn TCP +receive window. +.It Va rexmit_initial , rexmit_min , rexmit_slop +Adjust the retransmit timer calculation for +.Tn TCP . +The slop is +typically added to the raw calculation to take into account +occasional variances that the +.Tn SRTT +(smoothed round-trip time) +is unable to accommodate, while the minimum specifies an +absolute minimum. 
+While a number of +.Tn TCP +RFCs suggest a 1 +second minimum, these RFCs tend to focus on streaming behavior, +and fail to deal with the fact that a 1 second minimum has severe +detrimental effects over lossy interactive connections, such +as a 802.11b wireless link, and over very fast but lossy +connections for those cases not covered by the fast retransmit +code. +For this reason, we use 200ms of slop and a near-0 +minimum, which gives us an effective minimum of 200ms (similar to +.Tn Linux ) . +The initial value is used before an RTT measurement has been performed. +.It Va rfc1323 +Implement the window scaling and timestamp options of RFC 1323/RFC 7323 +(default is true). +.It Va rfc3042 +Enable the Limited Transmit algorithm as described in RFC 3042. +It helps avoid timeouts on lossy links and also when the congestion window +is small, as happens on short transfers. +.It Va rfc3390 +Enable support for RFC 3390, which allows for a variable-sized +starting congestion window on new connections, depending on the +maximum segment size. +This helps throughput in general, but +particularly affects short transfers and high-bandwidth large +propagation-delay connections. +.It Va rfc6675_pipe +Deprecated and superseded by +.Va sack.revised +.It Va sack.enable +Enable support for RFC 2018, TCP Selective Acknowledgment option, +which allows the receiver to inform the sender about all successfully +arrived segments, allowing the sender to retransmit the missing segments +only. +.It Va sack.globalmaxholes +Maximum number of SACK holes per system, across all connections. +Defaults to 65536. +.It Va sack.maxholes +Maximum number of SACK holes per connection. +Defaults to 128. +.It Va sack.revised +Enables three updated mechanisms from RFC6675 (default is true). +Calculate the bytes in flight using the algorithm described in RFC 6675, and +is also an improvement when Proportional Rate Reduction is enabled. 
+Next, Rescue Retransmission helps timely loss recovery, when the trailing segments +of a transmission are lost, while no additional data is ready to be sent. +In case a partial ACK without a SACK block is received during SACK loss +recovery, the trailing segment is immediately resent, rather than waiting +for a Retransmission timeout. +Finally, SACK loss recovery is also engaged, once two segments plus one byte are +SACKed - even if no traditional duplicate ACKs were observed. +.It Va sendspace +Maximum +.Tn TCP +send window. +.It Va syncookies +Determines whether or not +.Tn SYN +cookies should be generated for outbound +.Tn SYN-ACK +packets. +.Tn SYN +cookies are a great help during +.Tn SYN +flood attacks, and are enabled by default. +(See +.Xr syncookies 4 . ) +.It Va tcbhashsize +Size of the +.Tn TCP +control-block hash table +(read-only). +This is tuned using the kernel option +.Dv TCBHASHSIZE +or by setting +.Va net.inet.tcp.tcbhashsize +in the +.Xr loader 8 . +.It Va tolerate_missing_ts +Tolerate the missing of timestamps (RFC 1323/RFC 7323) for +.Tn TCP +segments belonging to +.Tn TCP +connections for which support of +.Tn TCP +timestamps has been negotiated. +As of June 2021, several TCP stacks are known to violate RFC 7323, including +modern widely deployed ones. +Therefore the default is 1, i.e., the missing of timestamps is tolerated. +.It Va ts_offset_per_conn +When initializing the TCP timestamps, use a per connection offset instead of a +per host pair offset. +Default is to use per connection offsets as recommended in RFC 7323. .It Va udp_tunneling_overhead The overhead taken into account when using UDP encapsulation. Since MSS clamping by middleboxes will most likely not work, values larger than 8 (the size of the UDP header) are also supported. Supported values are between 8 and 1024. The default is 8. +.It Va udp_tunneling_port +The local UDP encapsulation port. +A value of 0 indicates that UDP encapsulation is disabled. +The default is 0. 
+.It Va v6pmtud_blackhole_mss +MSS to try for IPv6 if PMTU blackhole detection is turned on. +See +.Va pmtud_blackhole_detection . .El .Sh ERRORS A socket operation may fail with one of the following errors returned: