From nobody Tue May 09 17:11:59 2023
Date: Tue, 9 May 2023 17:11:59 GMT
Message-Id: <202305091711.349HBx7h095736@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Andrew Gallatin
Subject: git: 198558523361 - main - ktls: re-work alloc thread
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: gallatin
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 198558523361a654409b6d3f8d63c12ba3f72ae5
Auto-Submitted: auto-generated
X-ThisMailContainsUnwantedMimeParts: N

The branch main has been updated by gallatin:

URL: https://cgit.FreeBSD.org/src/commit/?id=198558523361a654409b6d3f8d63c12ba3f72ae5

commit 198558523361a654409b6d3f8d63c12ba3f72ae5
Author:     Andrew Gallatin
AuthorDate: 2023-05-08 13:38:59 +0000
Commit:     Andrew Gallatin
CommitDate: 2023-05-09 17:09:34 +0000

    ktls: re-work alloc thread

    When the ktls_buffer zone needs to expand, it may fail due to
    a lack of physically contiguous memory.  We tried to rectify that
    by introducing an alloc thread to provide a context where it is
    harmless to sleep, and letting that thread repopulate the
    ktls_buffer zone.

    However, it turns out that M_WAITOK is not enough, and we must
    call vm_page_reclaim_contig_domain() to reclaim contiguous memory.
    Worse, M_WAITOK results in the allocation essentially busy-looping
    around vm_domain_alloc_fail() returning EAGAIN, causing
    vm_page_alloc_noobj_contig_domain() to loop and resulting in the
    alloc thread consuming 100% CPU.

    To fix this, we change the alloc thread to call
    vm_page_reclaim_contig_domain_ext().

    In order to prevent the busy loop around vm_domain_alloc_fail(),
    we must change the uma_zalloc flags to M_NORECLAIM | M_NOWAIT.
    However, once that is done, these allocations become no different
    than the allocations done in the critical path in
    ktls_buffer_alloc(), so it's best to just eliminate them.

    Since we're no longer doing allocations but just calling
    vm_page_reclaim_contig_domain_ext(), the name has changed to the
    ktls reclaim thread.

    Reviewed by: jhb, markj
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D39421
---
 sys/kern/uipc_ktls.c | 82 ++++++++++++++++++++++------------------------------
 1 file changed, 34 insertions(+), 48 deletions(-)

diff --git a/sys/kern/uipc_ktls.c b/sys/kern/uipc_ktls.c
index 4639355b1558..1e892dde9022 100644
--- a/sys/kern/uipc_ktls.c
+++ b/sys/kern/uipc_ktls.c
@@ -88,9 +88,9 @@ struct ktls_wq {
 	int lastallocfail;
 } __aligned(CACHE_LINE_SIZE);
 
-struct ktls_alloc_thread {
+struct ktls_reclaim_thread {
 	uint64_t wakeups;
-	uint64_t allocs;
+	uint64_t reclaims;
 	struct thread *td;
 	int running;
 };
@@ -98,7 +98,7 @@ struct ktls_alloc_thread {
 struct ktls_domain_info {
 	int count;
 	int cpu[MAXCPU];
-	struct ktls_alloc_thread alloc_td;
+	struct ktls_reclaim_thread reclaim_td;
 };
 
 struct ktls_domain_info ktls_domains[MAXMEMDOM];
@@ -154,10 +154,10 @@ SYSCTL_BOOL(_kern_ipc_tls, OID_AUTO, sw_buffer_cache, CTLFLAG_RDTUN,
     &ktls_sw_buffer_cache, 1,
     "Enable caching of output buffers for SW encryption");
 
-static int ktls_max_alloc = 128;
-SYSCTL_INT(_kern_ipc_tls, OID_AUTO, max_alloc, CTLFLAG_RWTUN,
-    &ktls_max_alloc, 128,
-    "Max number of 16k buffers to allocate in thread context");
+static int ktls_max_reclaim = 1024;
+SYSCTL_INT(_kern_ipc_tls, OID_AUTO, max_reclaim, CTLFLAG_RWTUN,
+    &ktls_max_reclaim, 128,
+    "Max number of 16k buffers to reclaim in thread context");
 
 static COUNTER_U64_DEFINE_EARLY(ktls_tasks_active);
 SYSCTL_COUNTER_U64(_kern_ipc_tls, OID_AUTO, tasks_active, CTLFLAG_RD,
@@ -303,7 +303,7 @@ static MALLOC_DEFINE(M_KTLS, "ktls", "Kernel TLS");
 static void ktls_reset_receive_tag(void *context, int pending);
 static void ktls_reset_send_tag(void *context, int pending);
 static void ktls_work_thread(void *ctx);
-static void ktls_alloc_thread(void *ctx);
+static void ktls_reclaim_thread(void *ctx);
 
 static u_int
 ktls_get_cpu(struct socket *so)
@@ -454,12 +454,12 @@ ktls_init(void)
 			continue;
 		if (CPU_EMPTY(&cpuset_domain[domain]))
 			continue;
-		error = kproc_kthread_add(ktls_alloc_thread,
+		error = kproc_kthread_add(ktls_reclaim_thread,
 		    &ktls_domains[domain], &ktls_proc,
-		    &ktls_domains[domain].alloc_td.td,
-		    0, 0, "KTLS", "alloc_%d", domain);
+		    &ktls_domains[domain].reclaim_td.td,
+		    0, 0, "KTLS", "reclaim_%d", domain);
 		if (error) {
-			printf("Can't add KTLS alloc thread %d error %d\n",
+			printf("Can't add KTLS reclaim thread %d error %d\n",
 			    domain, error);
 			return (error);
 		}
@@ -2702,9 +2702,9 @@ ktls_buffer_alloc(struct ktls_wq *wq, struct mbuf *m)
 		 * see an old value of running == true.
 		 */
 		if (!VM_DOMAIN_EMPTY(domain)) {
-			running = atomic_load_int(&ktls_domains[domain].alloc_td.running);
+			running = atomic_load_int(&ktls_domains[domain].reclaim_td.running);
 			if (!running)
-				wakeup(&ktls_domains[domain].alloc_td);
+				wakeup(&ktls_domains[domain].reclaim_td);
 		}
 	}
 	return (buf);
@@ -3121,65 +3121,51 @@ ktls_bind_domain(int domain)
 }
 
 static void
-ktls_alloc_thread(void *ctx)
+ktls_reclaim_thread(void *ctx)
 {
 	struct ktls_domain_info *ktls_domain = ctx;
-	struct ktls_alloc_thread *sc = &ktls_domain->alloc_td;
-	void **buf;
+	struct ktls_reclaim_thread *sc = &ktls_domain->reclaim_td;
 	struct sysctl_oid *oid;
 	char name[80];
-	int domain, error, i, nbufs;
+	int error, domain;
 
 	domain = ktls_domain - ktls_domains;
 	if (bootverbose)
-		printf("Starting KTLS alloc thread for domain %d\n", domain);
+		printf("Starting KTLS reclaim thread for domain %d\n", domain);
 	error = ktls_bind_domain(domain);
 	if (error)
-		printf("Unable to bind KTLS alloc thread for domain %d: error %d\n",
+		printf("Unable to bind KTLS reclaim thread for domain %d: error %d\n",
 		    domain, error);
 	snprintf(name, sizeof(name), "domain%d", domain);
 	oid = SYSCTL_ADD_NODE(NULL, SYSCTL_STATIC_CHILDREN(_kern_ipc_tls), OID_AUTO,
 	    name, CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, "");
-	SYSCTL_ADD_U64(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "allocs",
-	    CTLFLAG_RD, &sc->allocs, 0, "buffers allocated");
+	SYSCTL_ADD_U64(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "reclaims",
+	    CTLFLAG_RD, &sc->reclaims, 0, "buffers reclaimed");
 	SYSCTL_ADD_U64(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "wakeups",
 	    CTLFLAG_RD, &sc->wakeups, 0, "thread wakeups");
 	SYSCTL_ADD_INT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "running",
 	    CTLFLAG_RD, &sc->running, 0, "thread running");
 
-	buf = NULL;
-	nbufs = 0;
 	for (;;) {
 		atomic_store_int(&sc->running, 0);
 		tsleep(sc, PZERO | PNOLOCK, "-", 0);
 		atomic_store_int(&sc->running, 1);
 		sc->wakeups++;
-		if (nbufs != ktls_max_alloc) {
-			free(buf, M_KTLS);
-			nbufs = atomic_load_int(&ktls_max_alloc);
-			buf = malloc(sizeof(void *) * nbufs, M_KTLS,
-			    M_WAITOK | M_ZERO);
-		}
 		/*
-		 * Below we allocate nbufs with different allocation
-		 * flags than we use when allocating normally during
-		 * encryption in the ktls worker thread. We specify
-		 * M_NORECLAIM in the worker thread. However, we omit
-		 * that flag here and add M_WAITOK so that the VM
-		 * system is permitted to perform expensive work to
-		 * defragment memory. We do this here, as it does not
-		 * matter if this thread blocks. If we block a ktls
-		 * worker thread, we risk developing backlogs of
-		 * buffers to be encrypted, leading to surges of
-		 * traffic and potential NIC output drops.
+		 * Below we attempt to reclaim ktls_max_reclaim
+		 * buffers using vm_page_reclaim_contig_domain_ext().
+		 * We do this here, as this function can take several
+		 * seconds to scan all of memory and it does not
+		 * matter if this thread pauses for a while. If we
+		 * block a ktls worker thread, we risk developing
+		 * backlogs of buffers to be encrypted, leading to
+		 * surges of traffic and potential NIC output drops.
 		 */
-		for (i = 0; i < nbufs; i++) {
-			buf[i] = uma_zalloc(ktls_buffer_zone, M_WAITOK);
-			sc->allocs++;
-		}
-		for (i = 0; i < nbufs; i++) {
-			uma_zfree(ktls_buffer_zone, buf[i]);
-			buf[i] = NULL;
+		if (!vm_page_reclaim_contig_domain_ext(domain, VM_ALLOC_NORMAL,
+		    atop(ktls_maxlen), 0, ~0ul, PAGE_SIZE, 0, ktls_max_reclaim)) {
+			vm_wait_domain(domain);
+		} else {
+			sc->reclaims += ktls_max_reclaim;
 		}
 	}
 }
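
For readers who want to reason about the new loop outside the kernel, the
bookkeeping in ktls_reclaim_thread() can be modelled in user space. The
sketch below is not kernel code and is not part of this commit: struct
reclaim_model, reclaim_fn, and the model_* and stub_* names are hypothetical,
and the callback merely stands in for vm_page_reclaim_contig_domain_ext(),
which the diff uses in a boolean context. It shows only how wakeups, reclaims
and running change on a successful and on a failed pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct ktls_reclaim_thread (no td). */
struct reclaim_model {
	uint64_t wakeups;
	uint64_t reclaims;
	int running;
};

/* Hypothetical callback standing in for vm_page_reclaim_contig_domain_ext(). */
typedef bool (*reclaim_fn)(int domain, int desired_runs);

/*
 * Models the check in ktls_buffer_alloc(): a wakeup is issued only
 * when the reclaim thread is not already running.
 */
static bool
model_should_wakeup(const struct reclaim_model *sc)
{
	return (!sc->running);
}

/*
 * Models one pass of the loop body after tsleep() returns.  Returns
 * true if the reclaim succeeded; false is the case in which the real
 * thread calls vm_wait_domain() before going back to sleep.
 */
static bool
model_reclaim_pass(struct reclaim_model *sc, reclaim_fn fn, int domain,
    int max_reclaim)
{
	bool ok;

	assert(max_reclaim > 0);
	sc->running = 1;
	sc->wakeups++;
	ok = fn(domain, max_reclaim);
	if (ok)
		sc->reclaims += (uint64_t)max_reclaim;
	/* Corresponds to the store at the top of the next iteration. */
	sc->running = 0;
	return (ok);
}

/* Stub that pretends contiguous memory was reclaimed. */
static bool
stub_reclaim_ok(int domain, int desired_runs)
{
	(void)domain;
	(void)desired_runs;
	return (true);
}

/* Stub that pretends nothing could be reclaimed. */
static bool
stub_reclaim_fail(int domain, int desired_runs)
{
	(void)domain;
	(void)desired_runs;
	return (false);
}
```

Calling model_reclaim_pass() with stub_reclaim_ok and a max_reclaim of 1024
bumps wakeups by one and reclaims by 1024; with stub_reclaim_fail only
wakeups moves, which mirrors the if/else at the end of the new loop.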