Re: git: 3d8ef251aa9d - main - lib/libc/amd64/string/strchrnul.S: fix edge case in scalar code

From: Robert Clausecker <fuz_at_freebsd.org>
Date: Fri, 25 Aug 2023 21:53:04 UTC
Hi Jessica,

> That such bugs are being found in even the *scalar* code is concerning.
> How confident are we that all the various implementations have in fact
> been adequately tested if this slipped through? We have the 14 release
> coming up, and this makes me nervous about whether all this code is
> really ready for prime time yet. This seems like the kind of thing that
> needs a long time to bake in -CURRENT so that it can be used in anger
> and, hopefully, have all the edge cases hit. What have they been tested
> against, anyway? Can we borrow test suites from other projects?

Please understand that "scalar" means "no SIMD."  It doesn't mean "naïve
implementation" (it was requested that it be possible to use libc string
functions without them touching the SSE state, so I preserved existing
scalar implementations and added them where only generic code existed).

In fact, scalar implementations are usually more complicated than the
SIMD variants as x86 does not have SIMD-like operations for general
purpose registers, so they have to be emulated with bit manipulation
tricks.  Specifically, I used the hasless(x,n) approach from Anderson's
Bit Twiddling Hacks [1].  This approach is also used by the generic
strchrnul() function, although withough the part (processing the head of
the string until alignment is reached) that caused the bug.

> That such bugs are being found in even the *scalar* code is concerning.
> How confident are we that all the various implementations have in fact
> been adequately tested if this slipped through? We have the 14 release
> coming up, and this makes me nervous about whether all this code is
> really ready for prime time yet. This seems like the kind of thing that
> needs a long time to bake in -CURRENT so that it can be used in anger
> and, hopefully, have all the edge cases hit. What have they been tested
> against, anyway? Can we borrow test suites from other projects?

Rest assured that I try my best to thoroughly test the code.  We have
comprehensive tests for libc string functions (many from NetBSD) and
where they are missing or lacking, I have tried to extend the test
suite. The ultimate test is building world with the changes applied
where many different programs thoroughly exercise all sorts of string
functions.

This is incidentally how I caught this error: while implementing
memchr() using the same approach as strchrnul(), I found that clang
would crash when using this function.  Debugging turned up this weird
corner case, reflecting my own misunderstanding of the properties of the
hasless(x,n) function (it tells you where the first match is; subsequent
matches are not guaranteed to be accurate).

I will check with re@ how to proceed with respect to these changes.  If
desired, they can be postponed to 14.1, though I think it would be nice
to get them in now rather than in a year.

[1]: https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord

PS: if you like, I can add you as a subscriber to future submissions of
this project so you can assist in code review and early testing.
Currently, D41520, D41528, D41557, and D41567 are pending review.

Yours,
Robert Clausecker

-- 
()  ascii ribbon campaign - for an 8-bit clean world 
/\  - against html email  - against proprietary attachments