Date: Fri, 10 Jan 2025 15:03:50 GMT
Message-Id: <202501101503.50AF3o8C057004@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Robert Clausecker
Subject: git: f2bd390a54f1 - main - lib/libc/aarch64/string: add strcspn optimized implementation
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
X-BeenThere: dev-commits-src-all@freebsd.org
Sender: owner-dev-commits-src-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: fuz
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: f2bd390a54f183f85dd7faab815740fb3bea9591
Auto-Submitted: auto-generated

The branch main has been updated by fuz:

URL: https://cgit.FreeBSD.org/src/commit/?id=f2bd390a54f183f85dd7faab815740fb3bea9591

commit f2bd390a54f183f85dd7faab815740fb3bea9591
Author:     Getz Mikalsen
AuthorDate: 2024-08-26 18:14:01 +0000
Commit:     Robert Clausecker
CommitDate: 2025-01-10 15:02:39 +0000

    lib/libc/aarch64/string: add strcspn optimized implementation

    This is a port of the scalar optimized variant of strcspn for amd64
    to aarch64. It uses a lookup table (LUT) to speed up the function;
    a SIMD variant is still under development.

    Performance benchmarks are, as usual, generated by strperf.

    See the DR for benchmark results.

    Tested by:      fuz (exprun)
    Reviewed by:    fuz, emaste
    Sponsored by:   Google LLC (GSoC 2024)
    PR:             281175
    Differential Revision: https://reviews.freebsd.org/D46398
---
 lib/libc/aarch64/string/Makefile.inc |   3 +-
 lib/libc/aarch64/string/strcspn.S    | 109 +++++++++++++++++++++++++++++++++++
 2 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/lib/libc/aarch64/string/Makefile.inc b/lib/libc/aarch64/string/Makefile.inc
index 09bfaef963eb..34483532a3dd 100644
--- a/lib/libc/aarch64/string/Makefile.inc
+++ b/lib/libc/aarch64/string/Makefile.inc
@@ -22,7 +22,8 @@ AARCH64_STRING_FUNCS= \
 # SIMD-enhanced routines not derived from Arm's code
 MDSRCS+= \
 	strcmp.S \
-	strspn.S
+	strspn.S \
+	strcspn.S
 
 #
 # Add the above functions. Generate an asm file that includes the needed
diff --git a/lib/libc/aarch64/string/strcspn.S b/lib/libc/aarch64/string/strcspn.S
new file mode 100644
index 000000000000..8f2d6d20f0f6
--- /dev/null
+++ b/lib/libc/aarch64/string/strcspn.S
@@ -0,0 +1,109 @@
+/*-
+ * SPDX-License-Identifier: BSD-2-Clause
+ *
+ * Copyright (c) 2024 Getz Mikalsen
+*/
+
+#include
+
+	.weak	strcspn
+	.set	strcspn, __strcspn
+	.text
+
+ENTRY(__strcspn)
+	stp	x29, x30, [sp, #-16]!
+	mov	x29, sp
+	mov	x15, #1			// preload register with 1 for stores
+
+	/* check for special cases */
+	ldrb	w4, [x1]		// first character in the set
+	cbz	w4, .Lstrlen
+
+	movi	v0.16b, #0
+
+	ldrb	w5, [x1, #1]		// second character in the set
+	cbz	w5, .Lstrchr
+
+	sub	sp, sp, #256		// allocate 256 bytes on the stack
+
+	/* no special case matches -- prepare lookup table */
+	mov	w3, #20
+	.p2align 4
+0:	add	x9, sp, x3, lsl #3
+	stp	xzr, xzr, [x9]
+	stp	xzr, xzr, [x9, #16]
+	subs	w3, w3, #4
+	b.cs	0b
+
+	/* utilize SIMD stores to speed up zeroing the table */
+	stp	q0, q0, [sp, #6*32]
+	stp	q0, q0, [sp, #7*32]
+
+	add	x1, x1, #2
+	strb	w15, [sp, x4]		// register first chars in the set
+	strb	w15, [sp, x5]
+
+	mov	x4, x0			// stash a copy of src
+
+	/* process remaining chars in set */
+	.p2align 4
+0:	ldrb	w5, [x1]
+	strb	w15, [sp, x5]
+	cbz	w5, 1f			// end of set?
+
+	ldrb	w5, [x1, #1]
+	strb	w15, [sp, x5]
+	cbz	w5, 1f
+
+	add	x1, x1, #2
+	b	0b
+
+	/* find match */
+	.p2align 4
+1:	ldrb	w8, [x0]
+	ldrb	w9, [sp, x8]
+	cbnz	w9, 2f
+
+	ldrb	w8, [x0, #1]
+	ldrb	w9, [sp, x8]
+	cbnz	w9, 3f
+
+	ldrb	w8, [x0, #2]
+	ldrb	w9, [sp, x8]
+	cbnz	w9, 4f
+
+	ldrb	w8, [x0, #3]
+	ldrb	w9, [sp, x8]
+	add	x0, x0, #4
+	cbz	w9, 1b
+
+	sub	x0, x0, #3		// fix up return value
+4:	sub	x4, x4, #1
+3:	add	x0, x0, #1
+2:	sub	x0, x0, x4
+	mov	sp, x29
+	ldp	x29, x30, [sp], #16	// restore sp and lr
+	ret
+
+	/* set is empty, degrades to strlen */
+	.p2align 4
+.Lstrlen:
+	mov	sp, x29
+	ldp	x29, x30, [sp], #16	// restore sp and lr
+	b	strlen
+
+	/* just one character in set, degrades to strchrnul */
+	.p2align 4
+.Lstrchr:
+	stp	x0, x1, [sp, #-16]!
+	mov	x1, x4
+
+	bl	strchrnul
+
+	ldp	x18, x17, [sp], #16	// restore stashed src
+	sub	x0, x0, x18
+
+	ldp	x29, x30, [sp], #16	// Restore sp and lr
+	ret
+
+END(__strcspn)
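
For readers who do not read A64 assembly, the lookup-table approach described in the commit message can be sketched in portable C. This is an illustrative rendering only, not the committed code: the function name strcspn_lut_sketch is hypothetical, the one-character case uses strchr plus strlen as a stand-in for strchrnul (which is not ISO C), and the four-way loop unrolling of the assembly is omitted.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical C sketch of a LUT-based strcspn; the name and structure
 * are assumptions made for illustration, not the FreeBSD implementation.
 */
static size_t
strcspn_lut_sketch(const char *s, const char *set)
{
	unsigned char table[256];	/* one flag per possible byte value */
	const unsigned char *p = (const unsigned char *)s;
	const unsigned char *q = (const unsigned char *)set;

	/* Empty set: nothing can match, so the answer is strlen(s). */
	if (q[0] == '\0')
		return (strlen(s));

	/* One-character set: a single-byte search is enough. */
	if (q[1] == '\0') {
		const char *hit = strchr(s, q[0]);

		return (hit != NULL ? (size_t)(hit - s) : strlen(s));
	}

	/*
	 * General case: clear the table, then flag every byte of the set.
	 * The terminating NUL is flagged as well, so the scan below stops
	 * at the end of the string without a separate check.
	 */
	memset(table, 0, sizeof(table));
	do {
		table[*q] = 1;
	} while (*q++ != '\0');

	/* Scan until a flagged byte (set member or NUL) is found. */
	while (table[*p] == 0)
		p++;

	return ((size_t)(p - (const unsigned char *)s));
}
```

Under these assumptions, strcspn_lut_sketch("hello world", "ow") returns 4, matching what the standard strcspn returns for the same arguments. Flagging the NUL byte in the table is what lets the scan loop test a single condition per byte.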