Date: Sun, 18 Feb 2024 21:11:02 GMT
Message-Id: <202402182111.41ILB2mv012792@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: "Bjoern A. Zeeb"
Subject: git: 7730aec6b7c8 - stable/14 - LinuxKPI: reduce impact of large MAXCPU
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
X-Git-Committer: bz
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/14
X-Git-Reftype: branch
X-Git-Commit: 7730aec6b7c8ac6c6e4ca31577b8af0c15ebb3ec
Auto-Submitted: auto-generated

The branch stable/14 has been updated by bz:

URL: https://cgit.FreeBSD.org/src/commit/?id=7730aec6b7c8ac6c6e4ca31577b8af0c15ebb3ec

commit 7730aec6b7c8ac6c6e4ca31577b8af0c15ebb3ec
Author:     Bjoern A. Zeeb
AuthorDate: 2023-10-23 23:14:35 +0000
Commit:     Bjoern A. Zeeb
CommitDate: 2024-02-18 16:41:24 +0000

    LinuxKPI: reduce impact of large MAXCPU

    Start scaling arrays dynamically instead of using MAXCPU, resulting
    in extra allocations on startup but reducing the overall memory
    footprint.

    For the static single CPU mask we provide two versions to further
    save memory depending on a low or high CPU count system.
    The threshold to switch is currently at 128 CPUs on 64bit platforms.

    More detailed comments on the implementations can be found in the code.

    If I am not wrong on a MAXCPU=65536 system the memory footprint should
    roughly go down from 512M to 1.5M for the static single CPU mask.

    Submitted by:   olce (most of this final version)
    Sponsored by:   The FreeBSD Foundation
    PR:             274316
    Differential Revision: https://reviews.freebsd.org/D42345

    (cherry picked from commit 488e8a7faca51a71987fbf00cd36cfcd19269db7)
---
 sys/compat/linuxkpi/common/include/asm/processor.h |   2 +-
 sys/compat/linuxkpi/common/src/linux_compat.c      | 106 +++++++++++++++++++--
 2 files changed, 99 insertions(+), 9 deletions(-)

diff --git a/sys/compat/linuxkpi/common/include/asm/processor.h b/sys/compat/linuxkpi/common/include/asm/processor.h
index 1165702e0652..2bc4b6532544 100644
--- a/sys/compat/linuxkpi/common/include/asm/processor.h
+++ b/sys/compat/linuxkpi/common/include/asm/processor.h
@@ -54,7 +54,7 @@ struct cpuinfo_x86 {
 };
 
 extern struct cpuinfo_x86 boot_cpu_data;
-extern struct cpuinfo_x86 __cpu_data[];
+extern struct cpuinfo_x86 *__cpu_data;
 #define cpu_data(cpu) __cpu_data[cpu]
 
 #endif
diff --git a/sys/compat/linuxkpi/common/src/linux_compat.c b/sys/compat/linuxkpi/common/src/linux_compat.c
index c74021e32561..36ed7b84cc94 100644
--- a/sys/compat/linuxkpi/common/src/linux_compat.c
+++ b/sys/compat/linuxkpi/common/src/linux_compat.c
@@ -142,7 +142,8 @@ static void linux_cdev_deref(struct linux_cdev *ldev);
 static struct vm_area_struct *linux_cdev_handle_find(void *handle);
 
 cpumask_t cpu_online_mask;
-static cpumask_t static_single_cpu_mask[MAXCPU];
+static cpumask_t **static_single_cpu_mask;
+static cpumask_t *static_single_cpu_mask_lcs;
 struct kobject linux_class_root;
 struct device linux_root_device;
 struct class linux_class_misc;
@@ -2594,17 +2595,19 @@ io_mapping_create_wc(resource_size_t base, unsigned long size)
 #if defined(__i386__) || defined(__amd64__)
 bool linux_cpu_has_clflush;
 struct cpuinfo_x86 boot_cpu_data;
-struct cpuinfo_x86 __cpu_data[MAXCPU];
+struct cpuinfo_x86 *__cpu_data;
 #endif
 
 cpumask_t *
 lkpi_get_static_single_cpu_mask(int cpuid)
 {
 
-	KASSERT((cpuid >= 0 && cpuid < MAXCPU), ("%s: invalid cpuid %d\n",
+	KASSERT((cpuid >= 0 && cpuid <= mp_maxid), ("%s: invalid cpuid %d\n",
+	    __func__, cpuid));
+	KASSERT(!CPU_ABSENT(cpuid), ("%s: cpu with cpuid %d is absent\n",
 	    __func__, cpuid));
 
-	return (&static_single_cpu_mask[cpuid]);
+	return (static_single_cpu_mask[cpuid]);
 }
 
 bool
@@ -2659,7 +2662,9 @@ linux_compat_init(void *arg)
 	boot_cpu_data.x86_model = CPUID_TO_MODEL(cpu_id);
 	boot_cpu_data.x86_vendor = x86_vendor;
 
-	for (i = 0; i < MAXCPU; i++) {
+	__cpu_data = mallocarray(mp_maxid + 1,
+	    sizeof(*__cpu_data), M_KMALLOC, M_WAITOK | M_ZERO);
+	CPU_FOREACH(i) {
 		__cpu_data[i].x86_clflush_size = cpu_clflush_line_size;
 		__cpu_data[i].x86_max_cores = mp_ncpus;
 		__cpu_data[i].x86 = CPUID_TO_FAMILY(cpu_id);
@@ -2695,13 +2700,92 @@ linux_compat_init(void *arg)
 	CPU_COPY(&all_cpus, &cpu_online_mask);
 	/*
 	 * Generate a single-CPU cpumask_t for each CPU (possibly) in the system.
-	 * CPUs are indexed from 0..(MAXCPU-1). The entry for cpuid 0 will only
+	 * CPUs are indexed from 0..(mp_maxid). The entry for cpuid 0 will only
 	 * have itself in the cpumask, cupid 1 only itself on entry 1, and so on.
 	 * This is used by cpumask_of() (and possibly others in the future) for,
 	 * e.g., drivers to pass hints to irq_set_affinity_hint().
 	 */
-	for (i = 0; i < MAXCPU; i++)
-		CPU_SET(i, &static_single_cpu_mask[i]);
+	static_single_cpu_mask = mallocarray(mp_maxid + 1,
+	    sizeof(static_single_cpu_mask), M_KMALLOC, M_WAITOK | M_ZERO);
+
+	/*
+	 * When the number of CPUs reach a threshold, we start to save memory
+	 * given the sets are static by overlapping those having their single
+	 * bit set at same position in a bitset word. Asymptotically, this
+	 * regular scheme is in O(n²) whereas the overlapping one is in O(n)
+	 * only with n being the maximum number of CPUs, so the gain will become
+	 * huge quite quickly. The threshold for 64-bit architectures is 128
+	 * CPUs.
+	 */
+	if (mp_ncpus < (2 * _BITSET_BITS)) {
+		cpumask_t *sscm_ptr;
+
+		/*
+		 * This represents 'mp_ncpus * __bitset_words(CPU_SETSIZE) *
+		 * (_BITSET_BITS / 8)' bytes (for comparison with the
+		 * overlapping scheme).
+		 */
+		static_single_cpu_mask_lcs = mallocarray(mp_ncpus,
+		    sizeof(*static_single_cpu_mask_lcs),
+		    M_KMALLOC, M_WAITOK | M_ZERO);
+
+		sscm_ptr = static_single_cpu_mask_lcs;
+		CPU_FOREACH(i) {
+			static_single_cpu_mask[i] = sscm_ptr++;
+			CPU_SET(i, static_single_cpu_mask[i]);
+		}
+	} else {
+		/* Pointer to a bitset word. */
+		__typeof(((cpuset_t *)NULL)->__bits[0]) *bwp;
+
+		/*
+		 * Allocate memory for (static) spans of 'cpumask_t' ('cpuset_t'
+		 * really) with a single bit set that can be reused for all
+		 * single CPU masks by making them start at different offsets.
+		 * We need '__bitset_words(CPU_SETSIZE) - 1' bitset words before
+		 * the word having its single bit set, and the same amount
+		 * after.
+		 */
+		static_single_cpu_mask_lcs = mallocarray(_BITSET_BITS,
+		    (2 * __bitset_words(CPU_SETSIZE) - 1) * (_BITSET_BITS / 8),
+		    M_KMALLOC, M_WAITOK | M_ZERO);
+
+		/*
+		 * We rely below on cpuset_t and the bitset generic
+		 * implementation assigning words in the '__bits' array in the
+		 * same order of bits (i.e., little-endian ordering, not to be
+		 * confused with machine endianness, which concerns bits in
+		 * words and other integers). This is an imperfect test, but it
+		 * will detect a change to big-endian ordering.
+		 */
+		_Static_assert(
+		    __bitset_word(_BITSET_BITS + 1, _BITSET_BITS) == 1,
+		    "Assumes a bitset implementation that is little-endian "
+		    "on its words");
+
+		/* Initialize the single bit of each static span. */
+		bwp = (__typeof(bwp))static_single_cpu_mask_lcs +
+		    (__bitset_words(CPU_SETSIZE) - 1);
+		for (i = 0; i < _BITSET_BITS; i++) {
+			CPU_SET(i, (cpuset_t *)bwp);
+			bwp += (2 * __bitset_words(CPU_SETSIZE) - 1);
+		}
+
+		/*
+		 * Finally set all CPU masks to the proper word in their
+		 * relevant span.
+		 */
+		CPU_FOREACH(i) {
+			bwp = (__typeof(bwp))static_single_cpu_mask_lcs;
+			/* Find the non-zero word of the relevant span. */
+			bwp += (2 * __bitset_words(CPU_SETSIZE) - 1) *
+			    (i % _BITSET_BITS) +
+			    __bitset_words(CPU_SETSIZE) - 1;
+			/* Shift to find the CPU mask start. */
+			bwp -= (i / _BITSET_BITS);
+			static_single_cpu_mask[i] = (cpuset_t *)bwp;
+		}
+	}
 
 	strlcpy(init_uts_ns.name.release, osrelease, sizeof(init_uts_ns.name.release));
 }
@@ -2714,6 +2798,12 @@ linux_compat_uninit(void *arg)
 	linux_kobject_kfree_name(&linux_root_device.kobj);
 	linux_kobject_kfree_name(&linux_class_misc.kobj);
 
+	free(static_single_cpu_mask_lcs, M_KMALLOC);
+	free(static_single_cpu_mask, M_KMALLOC);
+#if defined(__i386__) || defined(__amd64__)
+	free(__cpu_data, M_KMALLOC);
+#endif
+
 	mtx_destroy(&vmmaplock);
 	spin_lock_destroy(&pci_lock);
 	rw_destroy(&linux_vma_lock);