[Bug 281710] RegEXP bug in bracket expression [^...] - sed(1), grep(1), re_format(7)

From: <bugzilla-noreply_at_freebsd.org>
Date: Wed, 25 Sep 2024 19:56:24 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281710

--- Comment #5 from Eric <erichanskrs@gmail.com> ---
In reference to comment #1 and comment #2:

If the referenced commit indeed solves the issue* (I'm not experienced enough
to judge, since the various commit logs of regcomp.c suggest that UTF-8 and the
internal C type representation/handling are in play), then, IMHO, from a
committer's POV, it may be worth considering whether it can be backported to
stable/14 (or even stable/13) or whether it should wait until stable/15.

I'm unfamiliar with the FreeBSD testing framework, but it might be useful to
test the singleton form [1] and the non-singleton form [2] against each other.
Here [1] prints nothing, while [2] prints the expected line:
[1] # echo '9â' | /usr/bin/sed      -En 's/([^â])(â)/-\1-\2-/p'
[2] # echo '9â' | /usr/bin/sed      -En 's/([^ââ])(â)/-\1-\2-/p'
-9-â-

The format of [2] could be used as a temporary workaround; HT to covacat:
https://forums.freebsd.org/threads/bug-in-regexp-sed-1-grep-1-and-re_format-7.95088/#post-673304
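
The comparison of [1] and [2] could be scripted roughly as below. This is only
a sketch and not written for the FreeBSD test framework: the helper names
run_case and compare_forms and the SED variable are made up for illustration,
and the sed binary defaults to whatever is first in PATH (set SED=/usr/bin/sed
to pin the base-system one).

```shell
#!/bin/sh
# Sketch: check that the singleton and non-singleton bracket forms agree.
# run_case, compare_forms and SED are hypothetical names, not existing tools.

# Feed the sample input to sed with the given script ($1).
run_case() {
    printf '%s\n' '9â' | "${SED:-sed}" -En "$1"
}

# Run both forms and report whether their output is identical.
# Returns 0 when the outputs match, non-zero otherwise.
compare_forms() {
    singleton=$(run_case 's/([^â])(â)/-\1-\2-/p')   # form [1]
    doubled=$(run_case 's/([^ââ])(â)/-\1-\2-/p')    # form [2]

    if [ "$singleton" = "$doubled" ]; then
        echo "ok: both forms print '$doubled'"
    else
        echo "MISMATCH: [1]='$singleton' [2]='$doubled'"
        return 1
    fi
}

compare_forms
```

On a system showing the bug, the two outputs should differ and the script
reports a mismatch; once the forms agree it prints the "ok" line.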

__
* I think this can be traced back to the commit that introduced the singleton
function (stable/5, ~FreeBSD 5.3; committed Jul 12, 2004):
https://github.com/freebsd/freebsd-src/commit/e5996857ad1f30f74f848a2c464c75a7ae28e59a#diff-3b4acfb4853c13cf5b563a3b6813988d5c10b057520e0c35384e4dfbc90e5793

Apart from a minor parameter change, the function hasn't changed since.
