git: 521c1fe0e200 - main - libc/aarch64: fix strlen() when flush-to-zero is set
Date: Thu, 16 Jan 2025 01:23:01 UTC
The branch main has been updated by fuz:

URL: https://cgit.FreeBSD.org/src/commit/?id=521c1fe0e2002dfd7d8db86eb7144b7865229912

commit 521c1fe0e2002dfd7d8db86eb7144b7865229912
Author:     Robert Clausecker <fuz@FreeBSD.org>
AuthorDate: 2025-01-13 13:41:41 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2025-01-16 01:20:30 +0000

    libc/aarch64: fix strlen() when flush-to-zero is set

    Our SIMD-enhanced strlen() implementation for AArch64 uses a
    floating-point comparison to compare a bit mask to zero.  This works
    fine under normal circumstances, but fails if the FZ (flush-to-zero)
    flag is set in FPCR (the floating-point control register) as then the
    CPU no longer distinguishes denormals from zero.  This was not caught
    during testing; this flag is rarely set and programs that do so rarely
    perform string manipulation.

    Avoid this problem by using an integer comparison instead.  The
    performance impact seems to be small (about 0.5 %) on the Windows 2023
    Dev Kit, but seems to be more significant (up to around 19%) on the
    RPi 5.

    Reviewed by:    getz
    Fixes:          3863fec1ce2dc6033f094a085118605ea89db9e2
    Differential Revision:  https://reviews.freebsd.org/D48442
---
 lib/libc/aarch64/string/strlen.S | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/libc/aarch64/string/strlen.S b/lib/libc/aarch64/string/strlen.S
index 7bfac7f4b1e1..6fefc252eca1 100644
--- a/lib/libc/aarch64/string/strlen.S
+++ b/lib/libc/aarch64/string/strlen.S
@@ -33,9 +33,8 @@ ENTRY(__strlen)
 	ldr	q0, [x10, #16]!
 	cmeq	v0.16b, v0.16b, #0
 	shrn	v0.8b, v0.8h, #4	// reduce to fit mask in GPR
-	fcmp	d0, #0.0
-	b.eq	.Lloop
 	fmov	x1, d0
+	cbz	x1, .Lloop
 .Ldone:
 	sub	x0, x10, x0
 	rbit	x1, x1			// reverse bits as NEON has no ctz
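
The failure mode described in the commit message can be illustrated in portable C. The sketch below is not part of the commit; the helpers nul_mask(), mask_as_double() and mask_is_subnormal() are hypothetical names introduced only for illustration. It assumes that the cmeq/shrn pair leaves one 0xf nibble per NUL byte in the 64-bit mask, and that the host uses IEEE 754 binary64 doubles with default floating-point settings. Under those assumptions, a NUL in any of the first 13 bytes of a chunk whose last three bytes are nonzero yields a mask whose exponent bits are all zero, i.e. a denormal, which a flush-to-zero comparison would treat as 0.0.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the mask that "cmeq" followed by "shrn #4" is
 * assumed to produce for one 16-byte chunk: nibble i is 0xf if byte i
 * of the chunk is NUL, and 0 otherwise.
 */
static uint64_t
nul_mask(const unsigned char chunk[16])
{
	uint64_t mask = 0;

	for (int i = 0; i < 16; i++)
		if (chunk[i] == 0)
			mask |= (uint64_t)0xf << (4 * i);
	return (mask);
}

/* Reinterpret the mask bits as a double, as "fcmp d0, #0.0" sees them. */
static double
mask_as_double(uint64_t mask)
{
	double d;

	memcpy(&d, &mask, sizeof(d));
	return (d);
}

/*
 * Nonzero if the mask, viewed as a double, is a denormal: nonzero as an
 * integer, but indistinguishable from 0.0 once flush-to-zero is active.
 */
static int
mask_is_subnormal(uint64_t mask)
{
	return (fpclassify(mask_as_double(mask)) == FP_SUBNORMAL);
}
```

For example, a chunk holding "abc" followed by a NUL and twelve nonzero bytes gives the mask 0xf000, for which mask_is_subnormal() returns nonzero. An integer test such as the cbz in the diff does not depend on the floating-point mode and so is unaffected.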