bug with special bracket expressions in regular expressions
Damian Weber
dweber at htw-saarland.de
Mon Sep 2 15:41:04 UTC 2013
On Mon, 2 Sep 2013, Andriy Gapon wrote:
> re_format(7) says:
> There are two special cases of bracket expressions: the bracket
> expressions '[[:<:]]' and '[[:>:]]' match the null string at the beginning and
> end of a word respectively. A word is defined as a sequence of word
> characters which is neither preceded nor followed by word characters. A
> word character is an alnum character (as defined by ctype(3)) or an
> underscore. This is an extension, compatible with but not specified by
> IEEE Std 1003.2 ("POSIX.2"), and should be used with caution in software
> intended to be portable to other systems.
>
> However I observe the following:
> $ echo "cd0 cd1 xx" | sed 's/cd[0-9][^ ]* *//g'
> xx
> $ echo "cd0 cd1 xx" | sed 's/[[:<:]]cd[0-9][^ ]* *//g'
> cd1 xx
>
> In my opinion '[[:<:]]' should not affect how the pattern is matched in this case.
>
> Any thoughts, suggestions?
There are two simpler expressions whose difference I don't understand either
(tested on 8.4-PRERELEASE)
$ echo "cd0 cd1 xx" | sed 's/cd[0-9] //g'
xx
$ echo "cd0 cd1 xx" | sed 's/[[:<:]]cd[0-9] //g'
cd1 xx
-- Damian
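
For comparison, here is a minimal sketch of the result one would expect if '[[:<:]]' behaved exactly as the re_format(7) text quoted above describes. It is not FreeBSD's regex code and does not explain the sed output. It assumes that Python lookarounds can stand in for '[[:<:]]', using "alnum or underscore" (ASCII only) as the word-character set. The names WORD_START and delete_all are made up for illustration.

```python
import re

# Assumption: a null-width match where the previous character is not a word
# character (or there is none) and the next character is a word character.
# This models the re_format(7) wording, not the FreeBSD regex engine.
WORD_START = r"(?<![0-9A-Za-z_])(?=[0-9A-Za-z_])"


def delete_all(pattern: str, line: str) -> str:
    """Delete every match of pattern in line, like sed 's/pattern//g'."""
    return re.sub(pattern, "", line)


line = "cd0 cd1 xx"

# The two patterns from the thread, each without and with the anchor.
results = {}
for body in (r"cd[0-9][^ ]* *", r"cd[0-9] "):
    results[body] = (
        delete_all(body, line),
        delete_all(WORD_START + body, line),
    )

for body, (plain, anchored) in results.items():
    print(f"{body!r}: plain={plain!r} anchored={anchored!r}")
```

Under that reading, both "cd0" and "cd1" start a word (one is at the start of the line, the other follows a space), so the anchored and unanchored substitutions both leave only "xx". The sed output above differs from this for the anchored patterns.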