Date: Mon, 5 Aug 2024 16:52:07 GMT
Message-Id: <202408051652.475Gq7jA067138@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Andrew Gallatin
Subject: git: 1f628be888b7 - main - tcp_ratelimit: provide an api for drivers to release ratesets at detach
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: gallatin
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 1f628be888b74f1219b3ea7ccea1e7a3d1db77a2
Auto-Submitted: auto-generated

The branch main has been updated by gallatin:

URL: https://cgit.FreeBSD.org/src/commit/?id=1f628be888b74f1219b3ea7ccea1e7a3d1db77a2

commit 1f628be888b74f1219b3ea7ccea1e7a3d1db77a2
Author:     Andrew Gallatin
AuthorDate: 2024-08-05 15:45:42 +0000
Commit:     Andrew Gallatin
CommitDate: 2024-08-05 16:51:35 +0000

    tcp_ratelimit: provide an api for drivers to release ratesets at detach

    When the kernel is compiled with options RATELIMIT, the mlx5en
    driver cannot detach. It gets stuck waiting for all kernel users of
    its rates to drop to zero before finally calling ether_ifdetach.

    The tcp ratelimit code has an eventhandler for ifnet departure
    which causes rates to be released. However, this is called as an
    ifnet departure eventhandler, which is invoked as part of
    ifdetach(), via ether_ifdetach(). This means that the tcp
    ratelimit code holds down many hw rates when the mlx5en driver is
    waiting for the rate count to go to 0. Thus devctl detach will
    deadlock on mlx5 with this stack:

    mi_switch+0xcf
    sleepq_timedwait+0x2f
    _sleep+0x1a3
    pause_sbt+0x77
    mlx5e_destroy_ifp+0xaf
    mlx5_remove_device+0xa7
    mlx5_unregister_device+0x78
    mlx5_unload_one+0x10a
    remove_one+0x1e
    linux_pci_detach_device+0x36
    linux_pci_detach+0x24
    device_detach+0x180
    devctl2_ioctl+0x3dc
    devfs_ioctl+0xbb
    vn_ioctl+0xca
    devfs_ioctl_f+0x1e
    kern_ioctl+0x1c3
    sys_ioctl+0x10a

    To fix this, provide an explicit API for a driver to call the tcp
    ratelimit code telling it to detach itself from an ifnet. This
    allows the mlx5 driver to unload cleanly. I considered adding an
    ifnet pre-departure eventhandler. However, that would need to be
    invoked by the driver, so a simple function call seemed better.
    The mlx5en driver has been updated to call this function.

    Reviewed by:    kib, rrs
    Differential Revision:  https://reviews.freebsd.org/D46221
    Sponsored by:   Netflix
---
 sys/dev/mlx5/mlx5_en/mlx5_en_main.c | 8 +++++++-
 sys/netinet/tcp_ratelimit.c         | 6 ++++++
 sys/netinet/tcp_ratelimit.h         | 9 +++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/sys/dev/mlx5/mlx5_en/mlx5_en_main.c b/sys/dev/mlx5/mlx5_en/mlx5_en_main.c
index ccbdf11a1dd5..a80235f0f347 100644
--- a/sys/dev/mlx5/mlx5_en/mlx5_en_main.c
+++ b/sys/dev/mlx5/mlx5_en/mlx5_en_main.c
@@ -36,6 +36,7 @@
 #include
 #include
+#include
 #include
 #include
@@ -4876,7 +4877,12 @@ mlx5e_destroy_ifp(struct mlx5_core_dev *mdev, void *vpriv)
 
 #ifdef RATELIMIT
 	/*
-	 * The kernel can have reference(s) via the m_snd_tag's into
+	 * Tell the TCP ratelimit code to release the rate-sets attached
+	 * to our ifnet.
+	 */
+	tcp_rl_release_ifnet(ifp);
+	/*
+	 * The kernel can still have reference(s) via the m_snd_tag's into
 	 * the ratelimit channels, and these must go away before
 	 * detaching:
 	 */
diff --git a/sys/netinet/tcp_ratelimit.c b/sys/netinet/tcp_ratelimit.c
index 1834c702c493..22bdf707fa89 100644
--- a/sys/netinet/tcp_ratelimit.c
+++ b/sys/netinet/tcp_ratelimit.c
@@ -1298,6 +1298,12 @@ tcp_rl_ifnet_departure(void *arg __unused, struct ifnet *ifp)
 	NET_EPOCH_EXIT(et);
 }
 
+void
+tcp_rl_release_ifnet(struct ifnet *ifp)
+{
+	tcp_rl_ifnet_departure(NULL, ifp);
+}
+
 static void
 tcp_rl_shutdown(void *arg __unused, int howto __unused)
 {
diff --git a/sys/netinet/tcp_ratelimit.h b/sys/netinet/tcp_ratelimit.h
index cd540d1164e1..0ce42dea0d90 100644
--- a/sys/netinet/tcp_ratelimit.h
+++ b/sys/netinet/tcp_ratelimit.h
@@ -94,6 +94,8 @@ CK_LIST_HEAD(head_tcp_rate_set, tcp_rate_set);
 #ifndef ETHERNET_SEGMENT_SIZE
 #define ETHERNET_SEGMENT_SIZE 1514
 #endif
+struct tcpcb;
+
 #ifdef RATELIMIT
 #define DETAILED_RATELIMIT_SYSCTL 1	/*
 					 * Undefine this if you don't want
@@ -131,6 +133,9 @@ tcp_get_pacing_burst_size_w_divisor(struct tcpcb *tp, uint64_t bw, uint32_t segs
 void
 tcp_rl_log_enobuf(const struct tcp_hwrate_limit_table *rte);
 
+void
+tcp_rl_release_ifnet(struct ifnet *ifp);
+
 #else
 static inline const struct tcp_hwrate_limit_table *
 tcp_set_pacing_rate(struct tcpcb *tp, struct ifnet *ifp,
@@ -218,6 +223,10 @@ tcp_rl_log_enobuf(const struct tcp_hwrate_limit_table *rte)
 {
 }
 
+static inline void
+tcp_rl_release_ifnet(struct ifnet *ifp)
+{
+}
 #endif
 
 /*
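
The ordering problem described in the commit message can be modelled in
user space. The sketch below is only an illustration: every toy_* name is
hypothetical and stands in for kernel code it does not reproduce, and the
bounded poll loop replaces the real driver's sleep so that the stuck case
terminates. It assumes the rates are released only from the departure
handler unless the destroy path asks for an explicit release first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an ifnet; not the kernel's struct ifnet. */
struct toy_ifnet {
	int hw_rates_in_use;	/* rates held by the toy ratelimit layer */
	bool detached;
};

/* Models the ratelimit layer dropping every rate it holds on ifp. */
static void
toy_rl_release_ifnet(struct toy_ifnet *ifp)
{
	ifp->hw_rates_in_use = 0;
}

/* Models ifdetach(): the departure handler only runs from here. */
static void
toy_ifdetach(struct toy_ifnet *ifp)
{
	toy_rl_release_ifnet(ifp);	/* departure eventhandler */
	ifp->detached = true;
}

/*
 * Models the driver's destroy path. It waits for the rate count to reach
 * zero before detaching. Returns false if the count never drops within
 * max_polls, which corresponds to the hang in the stack trace above.
 */
static bool
toy_destroy_ifp(struct toy_ifnet *ifp, bool explicit_release, int max_polls)
{
	if (explicit_release)
		toy_rl_release_ifnet(ifp);	/* the new explicit call */
	for (int i = 0; ifp->hw_rates_in_use != 0; i++) {
		if (i >= max_polls)
			return (false);	/* would wait forever */
	}
	assert(ifp->hw_rates_in_use == 0);
	toy_ifdetach(ifp);
	return (true);
}
```

Calling toy_destroy_ifp() on an interface with rates in use and
explicit_release set to false returns false, because the only release
point sits behind the wait. With explicit_release set to true the wait
ends immediately and the toy interface detaches.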