From nobody Wed Jul 19 14:45:21 2023 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4R5dsF2WJfz4nMYm; Wed, 19 Jul 2023 14:45:21 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4R5dsF1QJ8z3R4J; Wed, 19 Jul 2023 14:45:21 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1689777921; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=EHQQomC+jud4je8H2wP9zU2PTArOvu3pbI55FlONRyo=; b=fuVEz59Zjvjpxg3Zh4r5BVJilDQhDVbTN/tXFxK8PtyYB7QZNgbtvtQ+7h5p2b0wicTAm5 Cg2FfAt59lIxjnyWAFjJty+z5aQ5T+8IjiZmclNNQ+u9alvH2R95cvZImA51gTjvxug/Wi hsnFnuGaY6Uqevv6veAzz6B59i95knXL2rF0Z3SQvgUB6C1kGJq4QGJtXP3Cpg0Fpibisu oKc1TquLLum5GGViP0UVbgiglnVwhNHAFmbJlfZyoKBFYIySDGLrgHznxEiKzkT12GiCTz H3/P4xDpxqWiZj7nH5gP1oD/3+KrzMBPc2G1FPdQgdAx4r7evBfHiJNxb8U9Bw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1689777921; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=EHQQomC+jud4je8H2wP9zU2PTArOvu3pbI55FlONRyo=; b=vgE/8CU1WIGAjgw3fLECCn/vSDElekRXyP1MuEglJ3dxh0I0nPV0GNyRQ67Umls5UQssMY cn9+MOPJ4tUff1Z925NPeWRi5im3+QE4OoGKjYJwzhzkiRSs1kLJ+IA084XhWsc26iCQmM HibR48kb6hwmVFpNOr2k0S2aulM3kgIVeRYOi88o1/P0yQ+D9Jdg2jSU4kl9Uk9mCcQ5Sz TcYK4xxNZV+UqkY5lkrDiV+2NaZ2lrQ4vt6qRPpT8vmn9vnsS6XjHFxkKy5I5KhYhOpSeV fjWS9Cm8LkAmiAGsQEXD6Juiwu5OKf/LFkEC/om1wEMNazOtp1++qBNxYWCGiQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1689777921; a=rsa-sha256; cv=none; b=n2xCRS5Gk+HnMe6QVIC4E/fb/VGJXV3g/O78RUpCHAcAOmjjdNhIFkq+e6VvEiq7fJfe0T LX1I0xXsR1aVJL3p5ua5zJVV5HTENxyJfhtMz4uhZrwv5efG8ERr/7+Pfl5y2BgmNU5ZgA BQjl+yDGnhXgJqW+1DxR5f5sABqekASRAUKrR9OKgxe+Q4Oo0PTTys/c6VRhyHs+C5+g+g iyUqTHqdhSIhdp4fxWAGZiBk8IYmDoWVcP5Ddx5ycjX7/JFKsijgA4fV5hF3hS+uYN4Ysg iSniE3CAcmJIvuCTUUx6zWE4VCIaUXvsoWEveNJ3IHuCtkQ8zD5YnliKPQlrJQ== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4R5dsF0VVJzrGT; Wed, 19 Jul 2023 14:45:21 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.17.1/8.17.1) with ESMTP id 36JEjLLD049968; Wed, 19 Jul 2023 14:45:21 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.17.1/8.17.1/Submit) id 36JEjLsT049967; Wed, 19 Jul 2023 14:45:21 GMT (envelope-from git) Date: Wed, 19 Jul 2023 14:45:21 GMT Message-Id: <202307191445.36JEjLsT049967@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Doug Moore Subject: git: 6f251ef228e6 - main - radix_trie: simplify ge, le lookups List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: dougm X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: 6f251ef228e6ea3891cd7364c1e6d161297a2f90 Auto-Submitted: auto-generated The branch main has been updated by dougm: URL: https://cgit.FreeBSD.org/src/commit/?id=6f251ef228e6ea3891cd7364c1e6d161297a2f90 commit 6f251ef228e6ea3891cd7364c1e6d161297a2f90 Author: Doug Moore AuthorDate: 2023-07-19 14:43:31 +0000 Commit: Doug Moore CommitDate: 2023-07-19 14:43:31 +0000 radix_trie: simplify ge, le lookups Replace the implementations of lookup_le and lookup_ge with ones that do not use a stack or climb back up the tree, and instead exploit the popmap field to quickly identify the place to resume searching if the straightforward indexed search fails. The code size of the original functions shrinks by a combined 160 bytes on amd64, and the cumulative cycle count per invocation of the two functions together is reduced 20% in a buildworld test. Reviewed by: alc, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D40936 --- sys/kern/subr_pctrie.c | 290 ++++++++++++++++++++----------------------------- sys/vm/vm_radix.c | 285 +++++++++++++++++++----------------------------- 2 files changed, 228 insertions(+), 347 deletions(-) diff --git a/sys/kern/subr_pctrie.c b/sys/kern/subr_pctrie.c index ae8408a6e1ef..4cd7f4b085ba 100644 --- a/sys/kern/subr_pctrie.c +++ b/sys/kern/subr_pctrie.c @@ -472,211 +472,151 @@ pctrie_lookup_unlocked(struct pctrie *ptree, uint64_t index, smr_t smr) } /* - * Look up the nearest entry at a position bigger than or equal to index, - * assuming access is externally synchronized by a lock. + * Returns the value with the least index that is greater than or equal to the + * specified index, or NULL if there are no such values. + * + * Requires that access be externally synchronized by a lock. */ uint64_t * pctrie_lookup_ge(struct pctrie *ptree, uint64_t index) { - struct pctrie_node *stack[PCTRIE_LIMIT]; + struct pctrie_node *node, *succ; uint64_t *m; - struct pctrie_node *child, *node; -#ifdef INVARIANTS - int loops = 0; -#endif - unsigned tos; int slot; + /* + * Descend the trie as if performing an ordinary lookup for the + * specified value. However, unlike an ordinary lookup, as we descend + * the trie, we use "succ" to remember the last branching-off point, + * that is, the interior node under which the least value that is both + * outside our current path down the trie and greater than the specified + * index resides. (The node's popmap makes it fast and easy to + * recognize a branching-off point.) If our ordinary lookup fails to + * yield a value that is greater than or equal to the specified index, + * then we will exit this loop and perform a lookup starting from + * "succ". If "succ" is not NULL, then that lookup is guaranteed to + * succeed. + */ node = pctrie_root_load(ptree, NULL, PCTRIE_LOCKED); - if (node == NULL) - return (NULL); - else if (pctrie_isleaf(node)) { - m = pctrie_toval(node); - if (*m >= index) - return (m); - else - return (NULL); - } - tos = 0; - for (;;) { - /* - * If the keys differ before the current bisection node, - * then the search key might rollback to the earliest - * available bisection node or to the smallest key - * in the current node (if the owner is greater than the - * search key). - */ - if (pctrie_keybarr(node, index)) { - if (index > node->pn_owner) { -ascend: - KASSERT(++loops < 1000, - ("pctrie_lookup_ge: too many loops")); - - /* - * Pop nodes from the stack until either the - * stack is empty or a node that could have a - * matching descendant is found. - */ - do { - if (tos == 0) - return (NULL); - node = stack[--tos]; - } while (pctrie_slot(index, - node->pn_clev) == (PCTRIE_COUNT - 1)); - - /* - * The following computation cannot overflow - * because index's slot at the current level - * is less than PCTRIE_COUNT - 1. - */ - index = pctrie_trimkey(index, - node->pn_clev); - index += PCTRIE_UNITLEVEL(node->pn_clev); - } else - index = node->pn_owner; - KASSERT(!pctrie_keybarr(node, index), - ("pctrie_lookup_ge: keybarr failed")); - } - slot = pctrie_slot(index, node->pn_clev); - child = pctrie_node_load(&node->pn_child[slot], NULL, - PCTRIE_LOCKED); - if (pctrie_isleaf(child)) { - m = pctrie_toval(child); + succ = NULL; + while (node != NULL) { + if (pctrie_isleaf(node)) { + m = pctrie_toval(node); if (*m >= index) return (m); - } else if (child != NULL) - goto descend; - - /* Find the first set bit beyond the first slot+1 bits. */ - slot = ffs(node->pn_popmap & (-2 << slot)) - 1; - if (slot < 0) { + break; + } + if (pctrie_keybarr(node, index)) { /* - * A value or edge greater than the search slot is not - * found in the current node; ascend to the next - * higher-level node. + * If all values in this subtree are > index, then the + * least value in this subtree is the answer. */ - goto ascend; + if (node->pn_owner > index) + succ = node; + break; } - child = pctrie_node_load(&node->pn_child[slot], - NULL, PCTRIE_LOCKED); - KASSERT(child != NULL, ("%s: bad popmap slot %d in node %p", - __func__, slot, node)); - if (pctrie_isleaf(child)) - return (pctrie_toval(child)); - index = pctrie_trimkey(index, node->pn_clev + 1) + - slot * PCTRIE_UNITLEVEL(node->pn_clev); -descend: - KASSERT(node->pn_clev > 0, - ("pctrie_lookup_ge: pushing leaf's parent")); - KASSERT(tos < PCTRIE_LIMIT, - ("pctrie_lookup_ge: stack overflow")); - stack[tos++] = node; - node = child; + slot = pctrie_slot(index, node->pn_clev); + + /* + * Just in case the next search step leads to a subtree of all + * values < index, check popmap to see if a next bigger step, to + * a subtree of all pages with values > index, is available. If + * so, remember to restart the search here. + */ + if ((node->pn_popmap >> slot) > 1) + succ = node; + node = pctrie_node_load(&node->pn_child[slot], NULL, + PCTRIE_LOCKED); } + + /* + * Restart the search from the last place visited in the subtree that + * included some values > index, if there was such a place. + */ + if (succ == NULL) + return (NULL); + if (succ != node) { + /* + * Take a step to the next bigger sibling of the node chosen + * last time. In that subtree, all values > index. + */ + slot = pctrie_slot(index, succ->pn_clev) + 1; + KASSERT((succ->pn_popmap >> slot) != 0, + ("%s: no popmap siblings past slot %d in node %p", + __func__, slot, succ)); + slot += ffs(succ->pn_popmap >> slot) - 1; + succ = pctrie_node_load(&succ->pn_child[slot], NULL, + PCTRIE_LOCKED); + } + + /* + * Find the value in the subtree rooted at "succ" with the least index. + */ + while (!pctrie_isleaf(succ)) { + KASSERT(succ->pn_popmap != 0, + ("%s: no popmap children in node %p", __func__, succ)); + slot = ffs(succ->pn_popmap) - 1; + succ = pctrie_node_load(&succ->pn_child[slot], NULL, + PCTRIE_LOCKED); + } + return (pctrie_toval(succ)); } /* - * Look up the nearest entry at a position less than or equal to index, - * assuming access is externally synchronized by a lock. + * Returns the value with the greatest index that is less than or equal to the + * specified index, or NULL if there are no such values. + * + * Requires that access be externally synchronized by a lock. */ uint64_t * pctrie_lookup_le(struct pctrie *ptree, uint64_t index) { - struct pctrie_node *stack[PCTRIE_LIMIT]; + struct pctrie_node *node, *pred; uint64_t *m; - struct pctrie_node *child, *node; -#ifdef INVARIANTS - int loops = 0; -#endif - unsigned tos; int slot; + /* + * Mirror the implementation of pctrie_lookup_ge, described above. + */ node = pctrie_root_load(ptree, NULL, PCTRIE_LOCKED); - if (node == NULL) - return (NULL); - else if (pctrie_isleaf(node)) { - m = pctrie_toval(node); - if (*m <= index) - return (m); - else - return (NULL); - } - tos = 0; - for (;;) { - /* - * If the keys differ before the current bisection node, - * then the search key might rollback to the earliest - * available bisection node or to the largest key - * in the current node (if the owner is smaller than the - * search key). - */ + pred = NULL; + while (node != NULL) { + if (pctrie_isleaf(node)) { + m = pctrie_toval(node); + if (*m <= index) + return (m); + break; + } if (pctrie_keybarr(node, index)) { - if (index > node->pn_owner) { - index = node->pn_owner + PCTRIE_COUNT * - PCTRIE_UNITLEVEL(node->pn_clev); - } else { -ascend: - KASSERT(++loops < 1000, - ("pctrie_lookup_le: too many loops")); - - /* - * Pop nodes from the stack until either the - * stack is empty or a node that could have a - * matching descendant is found. - */ - do { - if (tos == 0) - return (NULL); - node = stack[--tos]; - } while (pctrie_slot(index, - node->pn_clev) == 0); - - /* - * The following computation cannot overflow - * because index's slot at the current level - * is greater than 0. - */ - index = pctrie_trimkey(index, - node->pn_clev); - } - index--; - KASSERT(!pctrie_keybarr(node, index), - ("pctrie_lookup_le: keybarr failed")); + if (node->pn_owner < index) + pred = node; + break; } slot = pctrie_slot(index, node->pn_clev); - child = pctrie_node_load(&node->pn_child[slot], NULL, + if ((node->pn_popmap & ((1 << slot) - 1)) != 0) + pred = node; + node = pctrie_node_load(&node->pn_child[slot], NULL, + PCTRIE_LOCKED); + } + if (pred == NULL) + return (NULL); + if (pred != node) { + slot = pctrie_slot(index, pred->pn_clev); + KASSERT((pred->pn_popmap & ((1 << slot) - 1)) != 0, + ("%s: no popmap siblings before slot %d in node %p", + __func__, slot, pred)); + slot = fls(pred->pn_popmap & ((1 << slot) - 1)) - 1; + pred = pctrie_node_load(&pred->pn_child[slot], NULL, + PCTRIE_LOCKED); + } + while (!pctrie_isleaf(pred)) { + KASSERT(pred->pn_popmap != 0, + ("%s: no popmap children in node %p", __func__, pred)); + slot = fls(pred->pn_popmap) - 1; + pred = pctrie_node_load(&pred->pn_child[slot], NULL, PCTRIE_LOCKED); - if (pctrie_isleaf(child)) { - m = pctrie_toval(child); - if (*m <= index) - return (m); - } else if (child != NULL) - goto descend; - - /* Find the last set bit among the first slot bits. */ - slot = fls(node->pn_popmap & ((1 << slot) - 1)) - 1; - if (slot < 0) { - /* - * A value or edge smaller than the search slot is not - * found in the current node; ascend to the next - * higher-level node. - */ - goto ascend; - } - child = pctrie_node_load(&node->pn_child[slot], - NULL, PCTRIE_LOCKED); - if (pctrie_isleaf(child)) - return (pctrie_toval(child)); - index = pctrie_trimkey(index, node->pn_clev + 1) + - (slot + 1) * PCTRIE_UNITLEVEL(node->pn_clev) - 1; -descend: - KASSERT(node->pn_clev > 0, - ("pctrie_lookup_le: pushing leaf's parent")); - KASSERT(tos < PCTRIE_LIMIT, - ("pctrie_lookup_le: stack overflow")); - stack[tos++] = node; - node = child; } + return (pctrie_toval(pred)); } /* diff --git a/sys/vm/vm_radix.c b/sys/vm/vm_radix.c index f6bdda70539b..6504d5031460 100644 --- a/sys/vm/vm_radix.c +++ b/sys/vm/vm_radix.c @@ -513,205 +513,146 @@ vm_radix_lookup_unlocked(struct vm_radix *rtree, vm_pindex_t index) } /* - * Look up the nearest entry at a position greater than or equal to index. + * Returns the page with the least pindex that is greater than or equal to the + * specified pindex, or NULL if there are no such pages. + * + * Requires that access be externally synchronized by a lock. */ vm_page_t vm_radix_lookup_ge(struct vm_radix *rtree, vm_pindex_t index) { - struct vm_radix_node *stack[VM_RADIX_LIMIT]; + struct vm_radix_node *rnode, *succ; vm_page_t m; - struct vm_radix_node *child, *rnode; -#ifdef INVARIANTS - int loops = 0; -#endif - int slot, tos; + int slot; + /* + * Descend the trie as if performing an ordinary lookup for the page + * with the specified pindex. However, unlike an ordinary lookup, as we + * descend the trie, we use "succ" to remember the last branching-off + * point, that is, the interior node under which the page with the least + * pindex that is both outside our current path down the trie and more + * than the specified pindex resides. (The node's popmap makes it fast + * and easy to recognize a branching-off point.) If our ordinary lookup + * fails to yield a page with a pindex that is greater than or equal to + * the specified pindex, then we will exit this loop and perform a + * lookup starting from "succ". If "succ" is not NULL, then that lookup + * is guaranteed to succeed. + */ rnode = vm_radix_root_load(rtree, LOCKED); - if (rnode == NULL) - return (NULL); - else if (vm_radix_isleaf(rnode)) { - m = vm_radix_topage(rnode); - if (m->pindex >= index) - return (m); - else - return (NULL); - } - tos = 0; - for (;;) { - /* - * If the keys differ before the current bisection node, - * then the search key might rollback to the earliest - * available bisection node or to the smallest key - * in the current node (if the owner is greater than the - * search key). - */ - if (vm_radix_keybarr(rnode, index)) { - if (index > rnode->rn_owner) { -ascend: - KASSERT(++loops < 1000, - ("vm_radix_lookup_ge: too many loops")); - - /* - * Pop nodes from the stack until either the - * stack is empty or a node that could have a - * matching descendant is found. - */ - do { - if (tos == 0) - return (NULL); - rnode = stack[--tos]; - } while (vm_radix_slot(index, - rnode->rn_clev) == (VM_RADIX_COUNT - 1)); - - /* - * The following computation cannot overflow - * because index's slot at the current level - * is less than VM_RADIX_COUNT - 1. - */ - index = vm_radix_trimkey(index, - rnode->rn_clev); - index += VM_RADIX_UNITLEVEL(rnode->rn_clev); - } else - index = rnode->rn_owner; - KASSERT(!vm_radix_keybarr(rnode, index), - ("vm_radix_lookup_ge: keybarr failed")); - } - slot = vm_radix_slot(index, rnode->rn_clev); - child = vm_radix_node_load(&rnode->rn_child[slot], LOCKED); - if (vm_radix_isleaf(child)) { - m = vm_radix_topage(child); + succ = NULL; + while (rnode != NULL) { + if (vm_radix_isleaf(rnode)) { + m = vm_radix_topage(rnode); if (m->pindex >= index) return (m); - } else if (child != NULL) - goto descend; - - /* Find the first set bit beyond the first slot+1 bits. */ - slot = ffs(rnode->rn_popmap & (-2 << slot)) - 1; - if (slot < 0) { + break; + } + if (vm_radix_keybarr(rnode, index)) { /* - * A page or edge greater than the search slot is not - * found in the current node; ascend to the next - * higher-level node. + * If all pages in this subtree have pindex > index, + * then the page in this subtree with the least pindex + * is the answer. */ - goto ascend; + if (rnode->rn_owner > index) + succ = rnode; + break; } - child = vm_radix_node_load(&rnode->rn_child[slot], LOCKED); - KASSERT(child != NULL, ("%s: bad popmap slot %d in rnode %p", - __func__, slot, rnode)); - if (vm_radix_isleaf(child)) - return (vm_radix_topage(child)); - index = vm_radix_trimkey(index, rnode->rn_clev + 1) + - slot * VM_RADIX_UNITLEVEL(rnode->rn_clev); -descend: - KASSERT(rnode->rn_clev > 0, - ("vm_radix_lookup_ge: pushing leaf's parent")); - KASSERT(tos < VM_RADIX_LIMIT, - ("vm_radix_lookup_ge: stack overflow")); - stack[tos++] = rnode; - rnode = child; + slot = vm_radix_slot(index, rnode->rn_clev); + + /* + * Just in case the next search step leads to a subtree of all + * pages with pindex < index, check popmap to see if a next + * bigger step, to a subtree of all pages with pindex > index, + * is available. If so, remember to restart the search here. + */ + if ((rnode->rn_popmap >> slot) > 1) + succ = rnode; + rnode = vm_radix_node_load(&rnode->rn_child[slot], LOCKED); + } + + /* + * Restart the search from the last place visited in the subtree that + * included some pages with pindex > index, if there was such a place. + */ + if (succ == NULL) + return (NULL); + if (succ != rnode) { + /* + * Take a step to the next bigger sibling of the node chosen + * last time. In that subtree, all pages have pindex > index. + */ + slot = vm_radix_slot(index, succ->rn_clev) + 1; + KASSERT((succ->rn_popmap >> slot) != 0, + ("%s: no popmap siblings past slot %d in node %p", + __func__, slot, succ)); + slot += ffs(succ->rn_popmap >> slot) - 1; + succ = vm_radix_node_load(&succ->rn_child[slot], LOCKED); } + + /* + * Find the page in the subtree rooted at "succ" with the least pindex. + */ + while (!vm_radix_isleaf(succ)) { + KASSERT(succ->rn_popmap != 0, + ("%s: no popmap children in node %p", __func__, succ)); + slot = ffs(succ->rn_popmap) - 1; + succ = vm_radix_node_load(&succ->rn_child[slot], LOCKED); + } + return (vm_radix_topage(succ)); } /* - * Look up the nearest entry at a position less than or equal to index. + * Returns the page with the greatest pindex that is less than or equal to the + * specified pindex, or NULL if there are no such pages. + * + * Requires that access be externally synchronized by a lock. */ vm_page_t vm_radix_lookup_le(struct vm_radix *rtree, vm_pindex_t index) { - struct vm_radix_node *stack[VM_RADIX_LIMIT]; + struct vm_radix_node *pred, *rnode; vm_page_t m; - struct vm_radix_node *child, *rnode; -#ifdef INVARIANTS - int loops = 0; -#endif - int slot, tos; + int slot; + /* + * Mirror the implementation of vm_radix_lookup_ge, described above. + */ rnode = vm_radix_root_load(rtree, LOCKED); - if (rnode == NULL) - return (NULL); - else if (vm_radix_isleaf(rnode)) { - m = vm_radix_topage(rnode); - if (m->pindex <= index) - return (m); - else - return (NULL); - } - tos = 0; - for (;;) { - /* - * If the keys differ before the current bisection node, - * then the search key might rollback to the earliest - * available bisection node or to the largest key - * in the current node (if the owner is smaller than the - * search key). - */ - if (vm_radix_keybarr(rnode, index)) { - if (index > rnode->rn_owner) { - index = rnode->rn_owner + VM_RADIX_COUNT * - VM_RADIX_UNITLEVEL(rnode->rn_clev); - } else { -ascend: - KASSERT(++loops < 1000, - ("vm_radix_lookup_le: too many loops")); - - /* - * Pop nodes from the stack until either the - * stack is empty or a node that could have a - * matching descendant is found. - */ - do { - if (tos == 0) - return (NULL); - rnode = stack[--tos]; - } while (vm_radix_slot(index, - rnode->rn_clev) == 0); - - /* - * The following computation cannot overflow - * because index's slot at the current level - * is greater than 0. - */ - index = vm_radix_trimkey(index, - rnode->rn_clev); - } - index--; - KASSERT(!vm_radix_keybarr(rnode, index), - ("vm_radix_lookup_le: keybarr failed")); - } - slot = vm_radix_slot(index, rnode->rn_clev); - child = vm_radix_node_load(&rnode->rn_child[slot], LOCKED); - if (vm_radix_isleaf(child)) { - m = vm_radix_topage(child); + pred = NULL; + while (rnode != NULL) { + if (vm_radix_isleaf(rnode)) { + m = vm_radix_topage(rnode); if (m->pindex <= index) return (m); - } else if (child != NULL) - goto descend; - - /* Find the last set bit among the first slot bits. */ - slot = fls(rnode->rn_popmap & ((1 << slot) - 1)) - 1; - if (slot < 0) { - /* - * A page or edge smaller than the search slot is not - * found in the current node; ascend to the next - * higher-level node. - */ - goto ascend; + break; } - child = vm_radix_node_load(&rnode->rn_child[slot], LOCKED); - KASSERT(child != NULL, ("%s: bad popmap slot %d in rnode %p", - __func__, slot, rnode)); - if (vm_radix_isleaf(child)) - return (vm_radix_topage(child)); - index = vm_radix_trimkey(index, rnode->rn_clev + 1) + - (slot + 1) * VM_RADIX_UNITLEVEL(rnode->rn_clev) - 1; -descend: - KASSERT(rnode->rn_clev > 0, - ("vm_radix_lookup_le: pushing leaf's parent")); - KASSERT(tos < VM_RADIX_LIMIT, - ("vm_radix_lookup_le: stack overflow")); - stack[tos++] = rnode; - rnode = child; + if (vm_radix_keybarr(rnode, index)) { + if (rnode->rn_owner < index) + pred = rnode; + break; + } + slot = vm_radix_slot(index, rnode->rn_clev); + if ((rnode->rn_popmap & ((1 << slot) - 1)) != 0) + pred = rnode; + rnode = vm_radix_node_load(&rnode->rn_child[slot], LOCKED); + } + if (pred == NULL) + return (NULL); + if (pred != rnode) { + slot = vm_radix_slot(index, pred->rn_clev); + KASSERT((pred->rn_popmap & ((1 << slot) - 1)) != 0, + ("%s: no popmap siblings before slot %d in node %p", + __func__, slot, pred)); + slot = fls(pred->rn_popmap & ((1 << slot) - 1)) - 1; + pred = vm_radix_node_load(&pred->rn_child[slot], LOCKED); + } + while (!vm_radix_isleaf(pred)) { + KASSERT(pred->rn_popmap != 0, + ("%s: no popmap children in node %p", __func__, pred)); + slot = fls(pred->rn_popmap) - 1; + pred = vm_radix_node_load(&pred->rn_child[slot], LOCKED); } + return (vm_radix_topage(pred)); } /*