grep and anchoring

Polytropon freebsd at edvax.de
Sun Jun 26 15:59:16 UTC 2016


On Sun, 26 Jun 2016 16:44:44 +0200, Daniël de Kok wrote:
> On 26 Jun 2016, at 16:34, Polytropon <freebsd at edvax.de> wrote:
> > Or is this just an "enrichment" your MUA added? :-)
> 
> Yes, Mac’s Mail.app likes to replace these. I didn’t use an ellipsis
> in the actual expression ;), just four dots.

And it also seems to turn the apostrophe ' into a single
closing quote ’. :-)



> > 	% echo "1234 1234 1234" | egrep -o '^....'
> > 	1234
> > 	 123
> > 	4 12
> [...]
> > First 4-character pattern is "1234", next is " 123",
> > and last is "4 12" (each 4 characters wide, as the
> > space character " " is also "any character" that matches
> > the . pattern). In the second example, the groups match
> > 4 characters each ("1234" x 3).
> 
> Note the anchoring (^), the pattern should only match any four
> characters at the beginning of the line, so the expected output
> is ‘1234’ and nothing more. ‘ 123' and '4 12' are not at the
> beginning of the line and should consequently not be printed
> to stdout.

You're right; according to "man grep":

       -o, --only-matching
              Show only the part of a matching line that matches PATTERN.

the only match for "^...." should be the first 4 digits, and the
output should stop there. Instead, the pattern matching is repeated
over the rest of the input line (leading to two "additional
results"), which really looks like a bug.
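
One way such output could arise (an assumption on my part, I have
not checked the grep source) is that -o, after printing a match,
re-runs the pattern on the rest of the line as if that remainder
were a new line, so that ^ matches again. A minimal sh sketch of
that behaviour, with sed standing in for the matcher and
rescan_anchored being a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: emulate an -o implementation that re-applies the
# anchored pattern '^....' to whatever is left of the line after each match.
# This is an illustration only, not the actual grep code.
rescan_anchored() {
    rest=$1
    while :; do
        # First four characters of the remainder; empty if fewer are left.
        m=$(printf '%s\n' "$rest" | sed -n 's/^\(....\).*/\1/p')
        [ -n "$m" ] || break
        printf '%s\n' "$m"
        # Drop the matched prefix; the remainder is then treated like a
        # fresh line, which is exactly what lets ^ match again.
        rest=${rest#"$m"}
    done
}

rescan_anchored "1234 1234 1234"
```

This prints "1234", " 123" and "4 12", the same three lines as
above, and stops once fewer than four characters remain ("34").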



> For comparison, the output of a recent GNU grep:
> 
> % echo "1234 1234 1234" | grep -o '^....'
> 1234
>
That is what _should_ happen, correct. Thanks for clarifying.
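
As a workaround where grep -o misbehaves like this, sed can
extract the anchored match without relying on -o at all. A small
sketch (first_four is just an illustrative name, not an existing
tool):

```shell
#!/bin/sh
# Hypothetical helper: print the first four characters of each input line.
# Lines shorter than four characters print nothing, as with '^....'.
first_four() {
    sed -n 's/^\(....\).*/\1/p'
}

echo "1234 1234 1234" | first_four
```

Here the substitution can only apply once per line, so the output
is the single line "1234".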



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


More information about the freebsd-questions mailing list