From nobody Mon Feb 19 08:07:38 2024 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4TdZs71HQyz5B2x2; Mon, 19 Feb 2024 08:07:39 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4TdZs70NHjz4Jt9; Mon, 19 Feb 2024 08:07:39 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1708330059; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=fOdjIrA5jZdNDg9iHC4U8/E6Xnfj6s+y7n7qkNB4Ut0=; b=A18LGjTgm+EkUY9YxfhntLGxzMwOQkLLcSsJ2EnnnucV6LQOL7ODunmjtqC9efaFnc6iYX SfI/AnWrLTxUP+c+FpyWW9KtVWIgGQDtXZbph+bwgq7OpwhjGUhxOm5UrN0or5eauC7QRZ SrQCh3qZ7RVcjgX3rX0aiP2kvKeTH1lsueSKeq1zcCSGd3rtHJyhtWOXR1koVQVOWcAzJY NzmaeKnRVHzk0xJmWHqwSaGqjgCnmsDNFrVc0Ccp8gKEjoVc0Q6PR+w2XAdKGnfWqtUNcG LlO4hh9MsCsbBm/YEYr1IIxPxJTvf//aRStBQpmAcNFzE/NQTeDSLFsn7C9QCA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1708330059; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=fOdjIrA5jZdNDg9iHC4U8/E6Xnfj6s+y7n7qkNB4Ut0=; b=wKPsmyaAJi12BYS0VMOKCWsuZs/ATRCJ90jAo5W2d6hYtFfXdEphm4wy5XLxSPVJ3Dfvou T5bDDT5JHqQzqgA0cmKD3C6fjIRK3Mays7gLbwD9FJWRcayZ1SLxshQ76vn5VUJpLuH1ov 6bG5/WG+TckToW9ny9rfgPBgOFVYwozMpYDKo2ZpFNfMW64ykqFi5/7unWh64DwC9NuPoL joYAsT91/gRTx8JEPOBf/BariUze1Mx0vbXBr3bgaLx3Urr79uFmJgrpeMsISN14LsUrSC 4V0gArlhLRxJovWqNWASPkaou0ixG+kLliLjwN2+e+Vpcrrww5n5Gh2E4mUPeA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1708330059; a=rsa-sha256; cv=none; b=kStfqHCBU6kR6iZksMa7PeDjO72rsM9cRkokrbOTPfsOkvUoPKZX8cPXO2d9RoOJv/cyc8 wTwa9dpkxbFEWsLv4d0uJ0KkKUIjPjFJMq3EtsfpFC4xqbyPPp/QFogFSn0ZyPAyeA53Fo /HvDTFhPHJ8IwE4YmoCQrJQGcdBfpnY8+dX3yrdJtnhXijv7acXxxmKgTmlLtmIu4NHM9A 4vzBx0uXudQSk9T9PH5qRQnc+tvaeBAvUHqdgkvraB8/Do6hTSjUF6lTpwCYxlT0coN79F /jOUX1rWksmxjjAvdfEm3suAnofqlbL/uIWL7spyZkF6gm6n9ouLTVjbjuk22w== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4TdZs66LCzzR82; Mon, 19 Feb 2024 08:07:38 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.17.1/8.17.1) with ESMTP id 41J87c0a015380; Mon, 19 Feb 2024 08:07:38 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.17.1/8.17.1/Submit) id 41J87c7l015377; Mon, 19 Feb 2024 08:07:38 GMT (envelope-from git) Date: Mon, 19 Feb 2024 08:07:38 GMT Message-Id: <202402190807.41J87c7l015377@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: "Bjoern A. Zeeb" Subject: git: 90aaf46d5208 - stable/13 - LinuxKPI: reduce impact of large MAXCPU List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: bz X-Git-Repository: src X-Git-Refname: refs/heads/stable/13 X-Git-Reftype: branch X-Git-Commit: 90aaf46d520816e7a92d88fc159fe8694a5e1e32 Auto-Submitted: auto-generated The branch stable/13 has been updated by bz: URL: https://cgit.FreeBSD.org/src/commit/?id=90aaf46d520816e7a92d88fc159fe8694a5e1e32 commit 90aaf46d520816e7a92d88fc159fe8694a5e1e32 Author: Bjoern A. Zeeb AuthorDate: 2023-10-23 23:14:35 +0000 Commit: Bjoern A. Zeeb CommitDate: 2024-02-19 08:01:58 +0000 LinuxKPI: reduce impact of large MAXCPU Start scaling arrays dynamically instead of using MAXCPU, resulting in extra allocations on startup but reducing the overall memory footprint. For the static single CPU mask we provide two versions to further save memory depending on a low or high CPU count system. The threshold to switch is currently at 128 CPUs on 64bit platforms. More detailed comments on the implementations can be found in the code. If I am not wrong on a MAXCPU=65536 system the memory footprint should roughly go down from 512M to 1.5M for the static single CPU mask. Submitted by: olce (most of this final version) Sponsored by: The FreeBSD Foundation PR: 274316 Differential Revision: https://reviews.freebsd.org/D42345 (cherry picked from commit 488e8a7faca51a71987fbf00cd36cfcd19269db7) --- sys/compat/linuxkpi/common/include/asm/processor.h | 2 +- sys/compat/linuxkpi/common/src/linux_compat.c | 106 +++++++++++++++++++-- 2 files changed, 99 insertions(+), 9 deletions(-) diff --git a/sys/compat/linuxkpi/common/include/asm/processor.h b/sys/compat/linuxkpi/common/include/asm/processor.h index 9e784396c63a..c55238d33505 100644 --- a/sys/compat/linuxkpi/common/include/asm/processor.h +++ b/sys/compat/linuxkpi/common/include/asm/processor.h @@ -41,7 +41,7 @@ struct cpuinfo_x86 { }; extern struct cpuinfo_x86 boot_cpu_data; -extern struct cpuinfo_x86 __cpu_data[]; +extern struct cpuinfo_x86 *__cpu_data; #define cpu_data(cpu) __cpu_data[cpu] #endif diff --git a/sys/compat/linuxkpi/common/src/linux_compat.c b/sys/compat/linuxkpi/common/src/linux_compat.c index 85fb072b9943..d3804e9ecf05 100644 --- a/sys/compat/linuxkpi/common/src/linux_compat.c +++ b/sys/compat/linuxkpi/common/src/linux_compat.c @@ -131,7 +131,8 @@ static void linux_cdev_deref(struct linux_cdev *ldev); static struct vm_area_struct *linux_cdev_handle_find(void *handle); cpumask_t cpu_online_mask; -static cpumask_t static_single_cpu_mask[MAXCPU]; +static cpumask_t **static_single_cpu_mask; +static cpumask_t *static_single_cpu_mask_lcs; struct kobject linux_class_root; struct device linux_root_device; struct class linux_class_misc; @@ -2775,17 +2776,19 @@ io_mapping_create_wc(resource_size_t base, unsigned long size) #if defined(__i386__) || defined(__amd64__) bool linux_cpu_has_clflush; struct cpuinfo_x86 boot_cpu_data; -struct cpuinfo_x86 __cpu_data[MAXCPU]; +struct cpuinfo_x86 *__cpu_data; #endif cpumask_t * lkpi_get_static_single_cpu_mask(int cpuid) { - KASSERT((cpuid >= 0 && cpuid < MAXCPU), ("%s: invalid cpuid %d\n", + KASSERT((cpuid >= 0 && cpuid <= mp_maxid), ("%s: invalid cpuid %d\n", + __func__, cpuid)); + KASSERT(!CPU_ABSENT(cpuid), ("%s: cpu with cpuid %d is absent\n", __func__, cpuid)); - return (&static_single_cpu_mask[cpuid]); + return (static_single_cpu_mask[cpuid]); } static void @@ -2801,7 +2804,9 @@ linux_compat_init(void *arg) boot_cpu_data.x86 = CPUID_TO_FAMILY(cpu_id); boot_cpu_data.x86_model = CPUID_TO_MODEL(cpu_id); - for (i = 0; i < MAXCPU; i++) { + __cpu_data = mallocarray(mp_maxid + 1, + sizeof(*__cpu_data), M_KMALLOC, M_WAITOK | M_ZERO); + CPU_FOREACH(i) { __cpu_data[i].x86_clflush_size = cpu_clflush_line_size; __cpu_data[i].x86_max_cores = mp_ncpus; __cpu_data[i].x86 = CPUID_TO_FAMILY(cpu_id); @@ -2836,13 +2841,92 @@ linux_compat_init(void *arg) CPU_COPY(&all_cpus, &cpu_online_mask); /* * Generate a single-CPU cpumask_t for each CPU (possibly) in the system. - * CPUs are indexed from 0..(MAXCPU-1). The entry for cpuid 0 will only + * CPUs are indexed from 0..(mp_maxid). The entry for cpuid 0 will only * have itself in the cpumask, cupid 1 only itself on entry 1, and so on. * This is used by cpumask_of() (and possibly others in the future) for, * e.g., drivers to pass hints to irq_set_affinity_hint(). */ - for (i = 0; i < MAXCPU; i++) - CPU_SET(i, &static_single_cpu_mask[i]); + static_single_cpu_mask = mallocarray(mp_maxid + 1, + sizeof(static_single_cpu_mask), M_KMALLOC, M_WAITOK | M_ZERO); + + /* + * When the number of CPUs reach a threshold, we start to save memory + * given the sets are static by overlapping those having their single + * bit set at same position in a bitset word. Asymptotically, this + * regular scheme is in O(n²) whereas the overlapping one is in O(n) + * only with n being the maximum number of CPUs, so the gain will become + * huge quite quickly. The threshold for 64-bit architectures is 128 + * CPUs. + */ + if (mp_ncpus < (2 * _BITSET_BITS)) { + cpumask_t *sscm_ptr; + + /* + * This represents 'mp_ncpus * __bitset_words(CPU_SETSIZE) * + * (_BITSET_BITS / 8)' bytes (for comparison with the + * overlapping scheme). + */ + static_single_cpu_mask_lcs = mallocarray(mp_ncpus, + sizeof(*static_single_cpu_mask_lcs), + M_KMALLOC, M_WAITOK | M_ZERO); + + sscm_ptr = static_single_cpu_mask_lcs; + CPU_FOREACH(i) { + static_single_cpu_mask[i] = sscm_ptr++; + CPU_SET(i, static_single_cpu_mask[i]); + } + } else { + /* Pointer to a bitset word. */ + __typeof(((cpuset_t *)NULL)->__bits[0]) *bwp; + + /* + * Allocate memory for (static) spans of 'cpumask_t' ('cpuset_t' + * really) with a single bit set that can be reused for all + * single CPU masks by making them start at different offsets. + * We need '__bitset_words(CPU_SETSIZE) - 1' bitset words before + * the word having its single bit set, and the same amount + * after. + */ + static_single_cpu_mask_lcs = mallocarray(_BITSET_BITS, + (2 * __bitset_words(CPU_SETSIZE) - 1) * (_BITSET_BITS / 8), + M_KMALLOC, M_WAITOK | M_ZERO); + + /* + * We rely below on cpuset_t and the bitset generic + * implementation assigning words in the '__bits' array in the + * same order of bits (i.e., little-endian ordering, not to be + * confused with machine endianness, which concerns bits in + * words and other integers). This is an imperfect test, but it + * will detect a change to big-endian ordering. + */ + _Static_assert( + __bitset_word(_BITSET_BITS + 1, _BITSET_BITS) == 1, + "Assumes a bitset implementation that is little-endian " + "on its words"); + + /* Initialize the single bit of each static span. */ + bwp = (__typeof(bwp))static_single_cpu_mask_lcs + + (__bitset_words(CPU_SETSIZE) - 1); + for (i = 0; i < _BITSET_BITS; i++) { + CPU_SET(i, (cpuset_t *)bwp); + bwp += (2 * __bitset_words(CPU_SETSIZE) - 1); + } + + /* + * Finally set all CPU masks to the proper word in their + * relevant span. + */ + CPU_FOREACH(i) { + bwp = (__typeof(bwp))static_single_cpu_mask_lcs; + /* Find the non-zero word of the relevant span. */ + bwp += (2 * __bitset_words(CPU_SETSIZE) - 1) * + (i % _BITSET_BITS) + + __bitset_words(CPU_SETSIZE) - 1; + /* Shift to find the CPU mask start. */ + bwp -= (i / _BITSET_BITS); + static_single_cpu_mask[i] = (cpuset_t *)bwp; + } + } strlcpy(init_uts_ns.name.release, osrelease, sizeof(init_uts_ns.name.release)); } @@ -2855,6 +2939,12 @@ linux_compat_uninit(void *arg) linux_kobject_kfree_name(&linux_root_device.kobj); linux_kobject_kfree_name(&linux_class_misc.kobj); + free(static_single_cpu_mask_lcs, M_KMALLOC); + free(static_single_cpu_mask, M_KMALLOC); +#if defined(__i386__) || defined(__amd64__) + free(__cpu_data, M_KMALLOC); +#endif + mtx_destroy(&vmmaplock); spin_lock_destroy(&pci_lock); rw_destroy(&linux_vma_lock);