From nobody Thu May 05 19:14:49 2022 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id B73881ACA421; Thu, 5 May 2022 19:14:49 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4KvNgF4fLtz4mmt; Thu, 5 May 2022 19:14:49 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1651778089; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=oLpHWWpSJrzFDY4Lrjgf2bFAlEh/Htq0/CyvYCWbu/g=; b=O+7DQOQtQut74xuNoxec7c/+24Dc1rVtdLWcx2dD7w+9tzHJEKc3gpc624gBw7Dwz6S2zC 3zo5UJum8pRLr1jeZX9aD6TUkVeunKc8hk6Db7Ss43DosXSQy6syyOE97Fr7Qjj2B58SU9 q9d5vWxlWS+RLhPujs2Fi6cFD7xKB+BX49tgILL5LuTRy5ui5Xjo6sxoVJwYa720k1EBvs Q/UKdwGdbL72gW0+niCTbtfPkPRFpz7qyFGKPIEBQTdH7mdl9DyYuGBbWXsgOeugak8UwS 3FTfEgX8DxqNVK1itU6v7/Wffmpl2n9kC2xrRo9mMQqhut0K0I1pRs+e81FttQ== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 801035245; Thu, 5 May 2022 19:14:49 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 245JEnJF090244; Thu, 5 May 2022 19:14:49 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 245JEnaC090243; Thu, 5 May 2022 19:14:49 GMT (envelope-from git) Date: Thu, 5 May 2022 19:14:49 GMT Message-Id: <202205051914.245JEnaC090243@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Ed Maste Subject: git: 80e60e236d85 - main - ifnet: make if_index global List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-main@freebsd.org X-BeenThere: dev-commits-src-main@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: emaste X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: 80e60e236d85d13cd2a34973dd408477288efa38 Auto-Submitted: auto-generated ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1651778089; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=oLpHWWpSJrzFDY4Lrjgf2bFAlEh/Htq0/CyvYCWbu/g=; b=c1Wflqc4kwTmEF1EZ5FDIFB2Kx8qjWj1qz/TqkaMcap2XLbBlipd1sTJYcs/gnD3t2u+JY YwvtxTqjxcJphTNR29tRftTsg97bH4HYMIiI2rfZW3UBZ9plkutiu0fGhL2YyAL2d1q2He 0dNCky2CaIbdJ1k/UnJAvhFYBhH2el5skaQvvHGNDAMeh9lIFtl7VEATqjF0EKOhG0vPTN lrjiqyH4c6XS+H6prnxqx0JFejNkqWFZCcVDDEANRpOF7RgWhnrQ+gltqwVKyopjfjFa/e 3u7bQDZ+dhP71BQ7pQes8HKQC6/oN2Y6CTP5I9/NUNWAe/z/Hv5QDlYs3RTe8w== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1651778089; a=rsa-sha256; cv=none; b=QgPS5rbLLX53k2Qoo+i3Wz70WSlwEKslZj9tgCyuLLTRWh4SgKKJ08E4Wzl4CNuDO8kXrs kOAE7ba3zW5FySL91damdC3y66BDC4MpBXMmVXBhCM97MH0WzlOiDX3AFVoOLd3vl17lvC GJpO7H5sLlj4rZA97RIvo8gTrQMegbrFdJKsjq4UE7XynRFOSQWnq06fsrHLsqPf75lbUh 2CnBWGHwFbxRQulV2ISbeRMdB37b46A9xxjdCy7xElVH16qs6dJxau8RKq8PQLs85NER/y 1qeq1p4NH+swYfX9LPIOajz5vGn0O580eh3TLl13gT1YWW1Kl2sDixPcPXhAMA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none X-ThisMailContainsUnwantedMimeParts: N The branch main has been updated by emaste: URL: https://cgit.FreeBSD.org/src/commit/?id=80e60e236d85d13cd2a34973dd408477288efa38 commit 80e60e236d85d13cd2a34973dd408477288efa38 Author: Gleb Smirnoff AuthorDate: 2022-01-27 05:58:44 +0000 Commit: Ed Maste CommitDate: 2022-05-05 18:38:07 +0000 ifnet: make if_index global Now that ifindex is static to if.c we can unvirtualize it. For lifetime of an ifnet its index never changes. To avoid leaking foreign interfaces the net.link.generic.system.ifcount sysctl and the ifnet_byindex() KPI filter their returned value on curvnet. Since if_vmove() no longer changes the if_index, inline ifindex_alloc() and ifindex_free() into if_alloc() and if_free() respectively. API wise the only change is that now minimum interface index can be greater than 1. The holes in interface indexes were always allowed. Reviewed by: kp Differential revision: https://reviews.freebsd.org/D33672 (cherry picked from commit 91f44749c6feb50f39af8805dd803e860f0418f1) --- sys/net/if.c | 214 ++++++++++++++++++++++++----------------------------------- 1 file changed, 85 insertions(+), 129 deletions(-) diff --git a/sys/net/if.c b/sys/net/if.c index de63b9366843..3b303fe42e99 100644 --- a/sys/net/if.c +++ b/sys/net/if.c @@ -311,19 +311,30 @@ VNET_DEFINE(struct ifnethead, ifnet); /* depend on static init XXX */ VNET_DEFINE(struct ifgrouphead, ifg_head); /* Table of ifnet by index. */ -VNET_DEFINE_STATIC(int, if_index); -#define V_if_index VNET(if_index) -VNET_DEFINE_STATIC(int, if_indexlim) = 8; -#define V_if_indexlim VNET(if_indexlim) -VNET_DEFINE_STATIC(struct ifnet **, ifindex_table); -#define V_ifindex_table VNET(ifindex_table) +static int if_index; +static int if_indexlim = 8; +static struct ifnet **ifindex_table; SYSCTL_NODE(_net_link_generic, IFMIB_SYSTEM, system, CTLFLAG_RW | CTLFLAG_MPSAFE, 0, "Variables global to all interfaces"); -SYSCTL_INT(_net_link_generic_system, IFMIB_IFCOUNT, ifcount, - CTLFLAG_VNET | CTLFLAG_RD, &VNET_NAME(if_index), 0, - "Number of configured interfaces"); +static int +sysctl_ifcount(SYSCTL_HANDLER_ARGS) +{ + int rv = 0; + + IFNET_RLOCK(); + for (int i = 1; i <= if_index; i++) + if (ifindex_table[i] != NULL && + ifindex_table[i]->if_vnet == curvnet) + rv = i; + IFNET_RUNLOCK(); + + return (sysctl_handle_int(oidp, &rv, 0, req)); +} +SYSCTL_PROC(_net_link_generic_system, IFMIB_IFCOUNT, ifcount, + CTLTYPE_INT | CTLFLAG_VNET | CTLFLAG_RD, NULL, 0, sysctl_ifcount, "I", + "Maximum known interface index"); /* * The global network interface list (V_ifnet) and related state (such as @@ -352,13 +363,19 @@ MALLOC_DEFINE(M_IFMADDR, "ether_multi", "link-level multicast address"); struct ifnet * ifnet_byindex(u_int idx) { + struct ifnet *ifp; NET_EPOCH_ASSERT(); - if (__predict_false(idx > V_if_index)) + if (__predict_false(idx > if_index)) return (NULL); - return (ck_pr_load_ptr(&V_ifindex_table[idx])); + ifp = ck_pr_load_ptr(&ifindex_table[idx]); + + if (curvnet != NULL && ifp != NULL && ifp->if_vnet != curvnet) + ifp = NULL; + + return (ifp); } struct ifnet * @@ -375,63 +392,20 @@ ifnet_byindex_ref(u_int idx) } /* - * Allocate an ifindex array entry. + * Network interface utility routines. + * + * Routines with ifa_ifwith* names take sockaddr *'s as + * parameters. */ -static void -ifindex_alloc(struct ifnet *ifp) -{ - u_short idx; - - IFNET_WLOCK(); - /* - * Try to find an empty slot below V_if_index. If we fail, take the - * next slot. - */ - for (idx = 1; idx <= V_if_index; idx++) { - if (V_ifindex_table[idx] == NULL) - break; - } - - /* Catch if_index overflow. */ - if (idx >= V_if_indexlim) { - struct ifnet **new, **old; - int newlim; - - newlim = V_if_indexlim * 2; - new = malloc(newlim * sizeof(*new), M_IFNET, M_WAITOK | M_ZERO); - memcpy(new, V_ifindex_table, V_if_indexlim * sizeof(*new)); - old = V_ifindex_table; - ck_pr_store_ptr(&V_ifindex_table, new); - V_if_indexlim = newlim; - epoch_wait_preempt(net_epoch_preempt); - free(old, M_IFNET); - } - if (idx > V_if_index) - V_if_index = idx; - - ifp->if_index = idx; - ck_pr_store_ptr(&V_ifindex_table[idx], ifp); - IFNET_WUNLOCK(); -} static void -ifindex_free(u_short idx) +if_init(void *arg __unused) { - IFNET_WLOCK_ASSERT(); - - ck_pr_store_ptr(&V_ifindex_table[idx], NULL); - while (V_if_index > 0 && - V_ifindex_table[V_if_index] == NULL) - V_if_index--; + ifindex_table = malloc(if_indexlim * sizeof(*ifindex_table), + M_IFNET, M_WAITOK | M_ZERO); } - -/* - * Network interface utility routines. - * - * Routines with ifa_ifwith* names take sockaddr *'s as - * parameters. - */ +SYSINIT(if_init, SI_SUB_INIT_IF, SI_ORDER_SECOND, if_init, NULL); static void vnet_if_init(const void *unused __unused) @@ -439,29 +413,11 @@ vnet_if_init(const void *unused __unused) CK_STAILQ_INIT(&V_ifnet); CK_STAILQ_INIT(&V_ifg_head); - V_ifindex_table = malloc(V_if_indexlim * sizeof(*V_ifindex_table), - M_IFNET, M_WAITOK | M_ZERO); vnet_if_clone_init(); } VNET_SYSINIT(vnet_if_init, SI_SUB_INIT_IF, SI_ORDER_SECOND, vnet_if_init, NULL); -#ifdef VIMAGE -static void -vnet_if_uninit(const void *unused __unused) -{ - - VNET_ASSERT(CK_STAILQ_EMPTY(&V_ifnet), ("%s:%d tailq &V_ifnet=%p " - "not empty", __func__, __LINE__, &V_ifnet)); - VNET_ASSERT(CK_STAILQ_EMPTY(&V_ifg_head), ("%s:%d tailq &V_ifg_head=%p " - "not empty", __func__, __LINE__, &V_ifg_head)); - - free((caddr_t)V_ifindex_table, M_IFNET); -} -VNET_SYSUNINIT(vnet_if_uninit, SI_SUB_INIT_IF, SI_ORDER_FIRST, - vnet_if_uninit, NULL); -#endif - static void if_link_ifnet(struct ifnet *ifp) { @@ -554,6 +510,7 @@ static struct ifnet * if_alloc_domain(u_char type, int numa_domain) { struct ifnet *ifp; + u_short idx; KASSERT(numa_domain <= IF_NODOM, ("numa_domain too large")); if (numa_domain == IF_NODOM) @@ -593,7 +550,37 @@ if_alloc_domain(u_char type, int numa_domain) ifp->if_get_counter = if_get_counter_default; ifp->if_pcp = IFNET_PCP_NONE; - ifindex_alloc(ifp); + /* Allocate an ifindex array entry. */ + IFNET_WLOCK(); + /* + * Try to find an empty slot below if_index. If we fail, take the + * next slot. + */ + for (idx = 1; idx <= if_index; idx++) { + if (ifindex_table[idx] == NULL) + break; + } + + /* Catch if_index overflow. */ + if (idx >= if_indexlim) { + struct ifnet **new, **old; + int newlim; + + newlim = if_indexlim * 2; + new = malloc(newlim * sizeof(*new), M_IFNET, M_WAITOK | M_ZERO); + memcpy(new, ifindex_table, if_indexlim * sizeof(*new)); + old = ifindex_table; + ck_pr_store_ptr(&ifindex_table, new); + if_indexlim = newlim; + epoch_wait_preempt(net_epoch_preempt); + free(old, M_IFNET); + } + if (idx > if_index) + if_index = idx; + + ifp->if_index = idx; + ck_pr_store_ptr(&ifindex_table[idx], ifp); + IFNET_WUNLOCK(); return (ifp); } @@ -663,23 +650,18 @@ if_free(struct ifnet *ifp) * epoch and then dereferencing ifp while we perform if_free(), * and after if_free() finished, too. * - * The reason is the VIMAGE. For some reason it was designed - * to require all sockets drained before destroying, but not all - * ifnets. A vnet destruction calls if_vmove() on ifnet, which - * causes ID change. But ID change and a possible misidentification - * of an ifnet later is a lesser problem, as it doesn't crash kernel. - * A worse problem is that removed interface may outlive the vnet it - * belongs too! The if_free_deferred() would see ifp->if_vnet freed. + * This early index freeing was important back when ifindex was + * virtualized and interface would outlive the vnet. */ - CURVNET_SET_QUIET(ifp->if_vnet); IFNET_WLOCK(); - MPASS(V_ifindex_table[ifp->if_index] == ifp); - ifindex_free(ifp->if_index); + MPASS(ifindex_table[ifp->if_index] == ifp); + ck_pr_store_ptr(&ifindex_table[ifp->if_index], NULL); + while (if_index > 0 && ifindex_table[if_index] == NULL) + if_index--; IFNET_WUNLOCK(); if (refcount_release(&ifp->if_refcount)) NET_EPOCH_CALL(if_free_deferred, &ifp->if_epoch_ctx); - CURVNET_RESTORE(); } /* @@ -823,7 +805,7 @@ if_attach_internal(struct ifnet *ifp, bool vmove) struct sockaddr_dl *sdl; struct ifaddr *ifa; - MPASS(V_ifindex_table[ifp->if_index] == ifp); + MPASS(ifindex_table[ifp->if_index] == ifp); #ifdef VIMAGE ifp->if_vnet = curvnet; @@ -1273,17 +1255,6 @@ if_vmove(struct ifnet *ifp, struct vnet *new_vnet) if (rc != 0) return (rc); - /* - * Unlink the ifnet from ifindex_table[] in current vnet, and shrink - * the if_index for that vnet if possible. - * - * NOTE: IFNET_WLOCK/IFNET_WUNLOCK() are assumed to be unvirtualized, - * or we'd lock on one vnet and unlock on another. - */ - IFNET_WLOCK(); - ifindex_free(ifp->if_index); - IFNET_WUNLOCK(); - /* * Perform interface-specific reassignment tasks, if provided by * the driver. @@ -1295,7 +1266,6 @@ if_vmove(struct ifnet *ifp, struct vnet *new_vnet) * Switch to the context of the target vnet. */ CURVNET_SET_QUIET(new_vnet); - ifindex_alloc(ifp); if_attach_internal(ifp, true); #ifdef DEV_BPF @@ -1931,7 +1901,6 @@ ifa_ifwithnet(const struct sockaddr *addr, int ignore_ptp, int fibnum) struct ifaddr *ifa_maybe = NULL; u_int af = addr->sa_family; const char *addr_data = addr->sa_data, *cplim; - const struct sockaddr_dl *sdl; NET_EPOCH_ASSERT(); /* @@ -1939,14 +1908,9 @@ ifa_ifwithnet(const struct sockaddr *addr, int ignore_ptp, int fibnum) * so do that if we can. */ if (af == AF_LINK) { - sdl = (const struct sockaddr_dl *)addr; - if (sdl->sdl_index && sdl->sdl_index <= V_if_index) { - ifp = ifnet_byindex(sdl->sdl_index); - if (ifp == NULL) - return (NULL); - - return (ifp->if_addr); - } + ifp = ifnet_byindex( + ((const struct sockaddr_dl *)addr)->sdl_index); + return (ifp ? ifp->if_addr : NULL); } /* @@ -4582,24 +4546,16 @@ DB_SHOW_COMMAND(ifnet, db_show_ifnet) DB_SHOW_ALL_COMMAND(ifnets, db_show_all_ifnets) { - VNET_ITERATOR_DECL(vnet_iter); struct ifnet *ifp; u_short idx; - VNET_FOREACH(vnet_iter) { - CURVNET_SET_QUIET(vnet_iter); -#ifdef VIMAGE - db_printf("vnet=%p\n", curvnet); -#endif - for (idx = 1; idx <= V_if_index; idx++) { - ifp = V_ifindex_table[idx]; - if (ifp == NULL) - continue; - db_printf( "%20s ifp=%p\n", ifp->if_xname, ifp); - if (db_pager_quit) - break; - } - CURVNET_RESTORE(); + for (idx = 1; idx <= if_index; idx++) { + ifp = ifindex_table[idx]; + if (ifp == NULL) + continue; + db_printf( "%20s ifp=%p\n", ifp->if_xname, ifp); + if (db_pager_quit) + break; } } #endif /* DDB */