svn commit: r354628 - in stable/11: contrib/netbsd-tests/usr.bin/grep usr.bin/grep usr.bin/grep/tests
Kyle Evans
kevans at FreeBSD.org
Mon Nov 11 19:54:10 UTC 2019
Author: kevans
Date: Mon Nov 11 19:54:08 2019
New Revision: 354628
URL: https://svnweb.freebsd.org/changeset/base/354628
Log:
MFC bsdgrep(1) fixes: r320414, r328559, r332805-r332806, r332809, r332832,
r332850-r332852, r332856, r332858, r332876, r333351, r334803,
r334806-r334809, r334821, r334837, r334889, r335188, r351769, r352691
r320414:
Expect :mmap_eof_not_eol to fail
It relies on a jemalloc feature (opt.redzone) no longer available after
r319971.
r328559:
Remove t_grep:mmap_eof_not_eol test
The test was marked as an expected failure in r320414 after r319971's import
of a newer jemalloc removed an essential feature (opt.redzone) for
reproducing the behavior it was testing. Since then, no way has been found
or demonstrated to reliably test the behavior, so remove the test.
r332805:
bsdgrep: Split match processing out of procfile
procfile is getting kind of hairy, and it's not going to get better as we
correct some more bits that assume we process one line at a time.
r332806:
bsdgrep: Clean up procmatches a little bit
r332809:
bsdgrep: Add some TODOs for future work on operating on chunks
r332832:
bsdgrep: Break procmatches down a little bit more
Split the matching and non-matching cases out into their own functions to
reduce future complexity. As the name implies, procmatches will eventually
process more than one match itself in the future.
r332850:
bsdgrep: Some light cleanup
There's no point checking for a bunch of file modes if we're not a
practicing believer of DIR_SKIP or DEV_SKIP.
This also reduces some style violations that were particularly ugly looking
when browsing through.
r332851:
bsdgrep: More trivial cleanup/style cleanup
We can avoid branching for these easily reduced patterns
r332852:
bsdgrep: if chain => switch
This makes some of this a little easier to follow (in my opinion).
r332856:
bsdgrep: Fix --include/--exclude ordering issues
Prior to r332851:
* --exclude always wins out over --include
* --exclude-dir always wins out over --include-dir
r332851 broke that behavior, resulting in:
* First of --exclude, --include wins
* First of --exclude-dir, --include-dir wins
As it turns out, both behaviors are wrong by modern grep standards: the
last matching rule on the command line wins, e.g.:
`grep --exclude foo --include foo 'thing' foo`
foo is included
`grep --include foo --exclude foo 'thing' foo`
foo is excluded
As tested with GNU grep 3.1.
This commit makes bsdgrep follow this behavior.
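The rule above can be sketched roughly as follows. This is an illustration
only, not the committed code: struct rule, enum rule_mode and name_selected()
are hypothetical names, and the sketch assumes that a file is searched by
default unless at least one --include rule was given.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rule type for this sketch; bsdgrep's own struct differs. */
enum rule_mode { RULE_INCLUDE, RULE_EXCLUDE };

struct rule {
	const char *glob;	/* fnmatch(3) pattern */
	enum rule_mode mode;
};

/*
 * Decide whether 'name' should be searched.  Rules are kept in
 * command-line order; every rule is checked, and the last one that
 * matches decides.  If no rule matches, the name is selected unless
 * at least one include rule exists (assumption, see lead-in).
 */
static bool
name_selected(const struct rule *rules, size_t nrules, const char *name)
{
	bool selected = true;
	size_t i;

	assert(name != NULL);

	/* Any include rule flips the default to "not selected". */
	for (i = 0; i < nrules; i++)
		if (rules[i].mode == RULE_INCLUDE) {
			selected = false;
			break;
		}

	/* No early exit: a later matching rule overrides an earlier one. */
	for (i = 0; i < nrules; i++)
		if (fnmatch(rules[i].glob, name, 0) == 0)
			selected = (rules[i].mode != RULE_EXCLUDE);

	return (selected);
}
```

Given the rules for `--exclude foo --include foo`, name_selected() reports
foo as selected; with the two rules in the opposite order it does not.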
r332858:
bsdgrep: Use grep_strdup instead of grep_malloc+strcpy
r332876:
bsdgrep: Fix build failure WITHOUT_LZMA (incorrect bracket placement)
r333351:
bsdgrep: Allow "-" to be passed to -f to mean "standard input"
A version of this patch was originally sent to me by se@, matching behavior
from newer versions of GNU grep.
While there have been some differences of opinion on whether stdin should be
closed or not after depleting it in process of -f, I've opted to leave stdin
open and just let the later matching stuff fail and result in a no-match.
I'm not married to the current behavior; it was generally chosen since we
are adopting this in particular from GNU grep, and I would like to stay
consistent without a strong argument to the contrary. The current behavior
isn't technically wrong, it's just fairly unfriendly to the developer-user
of grep that may not realize their usage is trivially invalid.
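A minimal sketch of the handling described above, assuming the pattern source
is read through stdio. The helper names open_pattern_source() and
close_pattern_source() are hypothetical and do not appear in bsdgrep.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Open a pattern file for reading.  "-" selects standard input rather
 * than a file literally named "-".  Returns NULL if fopen(3) fails.
 */
static FILE *
open_pattern_source(const char *fn)
{

	assert(fn != NULL);
	if (strcmp(fn, "-") == 0)
		return (stdin);
	return (fopen(fn, "r"));
}

/*
 * Release a pattern source.  stdin is deliberately left open, so a
 * later attempt to search stdin just finds it at EOF and matches
 * nothing.
 */
static void
close_pattern_source(const char *fn, FILE *f)
{

	if (strcmp(fn, "-") != 0)
		fclose(f);
}
```

Leaving stdin open keeps the two helpers symmetric: the caller never has to
remember whether the stream it was handed is one it is allowed to close.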
r334803:
netbsd-tests: grep(1): Add test for -c flag
Someone might be inclined to accidentally break this. Someone might have
written said test because they broke it locally.
r334806:
bsdgrep(1): Do some less dirty things with return types
Neither procfile nor grep_tree return anything meaningful to their callers.
None of the callers actually care about how many lines were matched in all
of the files they processed; it's all about "did anything match?"
This is generally just a light refactoring to remind me of what actually
matters as I'm rewriting these bits to care less about 'stuff'.
r334807:
bsdgrep(1): whoops, garbage collect the now write-only variable
r334808:
bsdgrep(1): Don't initialize fts_flags twice
Admittedly, this is a clang-scan complaint... but it wasn't wrong. fts_flags
is initialized by all cases in the switch(), which should be fairly obvious.
Annotate this anyways.
r334809:
netbsd-tests: bsdgrep(1): Add a test for -m, too
r334821:
bsdgrep(1): Slooowly peel away the chunky onion
(or peel off the band-aid, whatever floats your boat)
This addresses two separate issues:
1.) Nothing within bsdgrep actually knew whether it cared about line numbers
or not.
2.) The file layer knew nothing about the context in which it was being
called.
#1 is only important when we're *not* processing line-by-line. #2 is
debatably a good idea; the parsing context is only handy because that's
where we store current offset information and, as of this commit, whether or
not it needs to be line-aware.
r334837:
bsdgrep(1): Evict character sequence that moved in
r334889:
bsdgrep(1): Some more int -> bool conversions and name changes
Again motivated by upcoming work to rewrite a bunch of this- single-letter
variable names and slightly misleading variable names ("lastmatches" to
indicate that the last matched) are not helpful.
r335188:
bsdgrep(1): Remove redundant initialization; unconditionally assigned later
r351769:
bsdgrep(1): add some basic tests for some GNU Extension support
These will be expanded later as I come up with good test cases; for now,
these seem to be enough to trigger bugs in base gnugrep and expose missing
features in bsdgrep.
r352691:
bsdgrep(1): various fixes of empty pattern/exit code/-c behavior
When an empty pattern is encountered in the pattern list, I had previously
broken bsdgrep to count that as a "match all" and ignore any other patterns
in the list. This commit rectifies that mistake, among others:
- The -v flag semantics were not quite right; lines matched should have been
counted differently based on whether the -v flag was set or not. procline
now definitively returns whether it's matched or not, and interpreting
that result has been kicked up a level.
- Empty patterns with the -x flag were broken similarly to empty patterns
with the -w flag. The former is a whole-line match and should be more
strict, only matching blank lines. No -x and no -w will match the
empty string at the beginning of each line.
- The exit code with -L was broken, w.r.t. modern grep. Modern grep will
exit(0) if any file that didn't match was output, so our interpretation
was simply backwards. The new interpretation makes sense to me.
Tests updated and added to try and catch some of this.
This misbehavior was found by autoconf while fixing ports found in PR 229925
expecting either a more sane or a more GNU-like sed.
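The exit code change can be summarized with a small sketch. It mirrors the
exit() expression in the grep.c hunk of this diff, but grep_exit_status() is
a hypothetical helper that takes the relevant flags as parameters instead of
reading bsdgrep's globals.

```c
#include <stdbool.h>

/*
 * Map the outcome of a run onto an exit status: 0 when something was
 * selected, 1 when nothing was, 2 when a file error occurred (except
 * that -q with a match still exits 0).  With -L, "selected" means a
 * file without matches, so the match result is inverted first.
 */
static int
grep_exit_status(bool matched, bool file_err, bool qflag, bool Lflag)
{

	if (Lflag)
		matched = !matched;
	if (matched)
		return (file_err ? (qflag ? 0 : 2) : 0);
	return (file_err ? 2 : 1);
}
```

For example, `grep -L -e "D" test1` on a file with no "D" has no match, so
the inverted result selects the file and the status is 0, as the updated
grep_nomatch_flags test expects.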
Modified:
stable/11/contrib/netbsd-tests/usr.bin/grep/t_grep.sh
stable/11/usr.bin/grep/file.c
stable/11/usr.bin/grep/grep.1
stable/11/usr.bin/grep/grep.c
stable/11/usr.bin/grep/grep.h
stable/11/usr.bin/grep/tests/grep_freebsd_test.sh
stable/11/usr.bin/grep/util.c
Directory Properties:
stable/11/ (props changed)
Modified: stable/11/contrib/netbsd-tests/usr.bin/grep/t_grep.sh
==============================================================================
--- stable/11/contrib/netbsd-tests/usr.bin/grep/t_grep.sh Mon Nov 11 19:06:04 2019 (r354627)
+++ stable/11/contrib/netbsd-tests/usr.bin/grep/t_grep.sh Mon Nov 11 19:54:08 2019 (r354628)
@@ -413,6 +413,60 @@ wflag_emptypat_body()
atf_check -o file:test4 grep -w -e "" test4
}
+atf_test_case xflag_emptypat
+xflag_emptypat_body()
+{
+ printf "" > test1
+ printf "\n" > test2
+ printf "qaz" > test3
+ printf " qaz\n" > test4
+
+ # -x is whole-line, more strict than -w.
+ atf_check -s exit:1 -o empty grep -x -e "" test1
+
+ atf_check -o file:test2 grep -x -e "" test2
+
+ atf_check -s exit:1 -o empty grep -x -e "" test3
+
+ atf_check -s exit:1 -o empty grep -x -e "" test4
+
+ total=$(wc -l /COPYRIGHT | sed 's/[^0-9]//g')
+
+ # Simple checks that grep -x with an empty pattern isn't matching every
+ # line. The exact counts aren't important, as long as they don't
+ # match the total line count and as long as they don't match each other.
+ atf_check -o save:xpositive.count grep -Fxc '' /COPYRIGHT
+ atf_check -o save:xnegative.count grep -Fvxc '' /COPYRIGHT
+
+ atf_check -o not-inline:"${total}" cat xpositive.count
+ atf_check -o not-inline:"${total}" cat xnegative.count
+
+ atf_check -o not-file:xnegative.count cat xpositive.count
+}
+
+atf_test_case xflag_emptypat_plus
+xflag_emptypat_plus_body()
+{
+ printf "foo\n\nbar\n\nbaz\n" > target
+ printf "foo\n \nbar\n \nbaz\n" > target_spacelines
+ printf "foo\nbar\nbaz\n" > matches
+ printf " \n \n" > spacelines
+
+ printf "foo\n\nbar\n\nbaz\n" > patlist1
+ printf "foo\n\nba\n\nbaz\n" > patlist2
+
+ sed -e '/bar/d' target > matches_not2
+
+ # Normal handling first
+ atf_check -o file:target grep -Fxf patlist1 target
+ atf_check -o file:matches grep -Fxf patlist1 target_spacelines
+ atf_check -o file:matches_not2 grep -Fxf patlist2 target
+
+ # -v handling
+ atf_check -s exit:1 -o empty grep -Fvxf patlist1 target
+ atf_check -o file:spacelines grep -Fxvf patlist1 target_spacelines
+}
+
atf_test_case excessive_matches
excessive_matches_head()
{
@@ -551,6 +605,12 @@ grep_nomatch_flags_head()
grep_nomatch_flags_body()
{
+ grep_type
+
+ if [ $? -eq $GREP_TYPE_GNU_FREEBSD ]; then
+ atf_expect_fail "this test does not pass with GNU grep in base"
+ fi
+
printf "A\nB\nC\n" > test1
atf_check -o inline:"1\n" grep -c -C 1 -e "B" test1
@@ -563,7 +623,7 @@ grep_nomatch_flags_body()
atf_check -o inline:"test1\n" grep -l -A 1 -e "B" test1
atf_check -o inline:"test1\n" grep -l -C 1 -e "B" test1
- atf_check -s exit:1 -o inline:"test1\n" grep -L -e "D" test1
+ atf_check -o inline:"test1\n" grep -L -e "D" test1
atf_check -o empty grep -q -e "B" test1
atf_check -o empty grep -q -B 1 -e "B" test1
@@ -646,28 +706,6 @@ mmap_body()
atf_check -s exit:1 grep --mmap -e "Z" test1
}
-atf_test_case mmap_eof_not_eol
-mmap_eof_not_eol_head()
-{
- atf_set "descr" "Check --mmap flag handling of encountering EOF without EOL (PR 165471, 219402)"
-}
-mmap_eof_not_eol_body()
-{
- grep_type
- if [ $? -eq $GREP_TYPE_GNU ]; then
- atf_expect_fail "gnu grep from ports has no --mmap option"
- fi
-
- printf "ABC" > test1
- jot -b " " -s "" 4096 >> test2
-
- atf_check -s exit:0 -o inline:"B\n" grep --mmap -oe "B" test1
- # Dependency on jemalloc(3) to detect buffer overflow, otherwise this
- # unreliably produces a SIGSEGV or SIGBUS
- atf_check -s exit:0 -o not-empty \
- env MALLOC_CONF="redzone:true" grep --mmap -e " " test2
-}
-
atf_test_case matchall
matchall_head()
{
@@ -738,6 +776,38 @@ fgrep_oflag_body()
atf_check -s exit:1 grep -Fo "ghix" test1
atf_check -s exit:1 grep -Fo "abcdefghiklmnopqrstuvwxyz" test1
}
+
+atf_test_case cflag
+cflag_head()
+{
+ atf_set "descr" "Check proper handling of -c"
+}
+cflag_body()
+{
+ printf "a\nb\nc\n" > test1
+
+ atf_check -o inline:"1\n" grep -Ec "a" test1
+ atf_check -o inline:"2\n" grep -Ec "a|b" test1
+ atf_check -o inline:"3\n" grep -Ec "a|b|c" test1
+
+ atf_check -o inline:"test1:2\n" grep -EHc "a|b" test1
+}
+
+atf_test_case mflag
+mflag_head()
+{
+ atf_set "descr" "Check proper handling of -m"
+}
+mflag_body()
+{
+ printf "a\nb\nc\nd\ne\nf\n" > test1
+
+ atf_check -o inline:"1\n" grep -m 1 -Ec "a" test1
+ atf_check -o inline:"2\n" grep -m 2 -Ec "a|b" test1
+ atf_check -o inline:"3\n" grep -m 3 -Ec "a|b|c|f" test1
+
+ atf_check -o inline:"test1:2\n" grep -m 2 -EHc "a|b|e|f" test1
+}
# End FreeBSD
atf_init_test_cases()
@@ -767,6 +837,8 @@ atf_init_test_cases()
atf_add_test_case egrep_empty_invalid
atf_add_test_case zerolen
atf_add_test_case wflag_emptypat
+ atf_add_test_case xflag_emptypat
+ atf_add_test_case xflag_emptypat_plus
atf_add_test_case excessive_matches
atf_add_test_case wv_combo_break
atf_add_test_case fgrep_sanity
@@ -777,10 +849,11 @@ atf_init_test_cases()
atf_add_test_case binary_flags
atf_add_test_case badcontext
atf_add_test_case mmap
- atf_add_test_case mmap_eof_not_eol
atf_add_test_case matchall
atf_add_test_case fgrep_multipattern
atf_add_test_case fgrep_icase
atf_add_test_case fgrep_oflag
+ atf_add_test_case cflag
+ atf_add_test_case mflag
# End FreeBSD
}
Modified: stable/11/usr.bin/grep/file.c
==============================================================================
--- stable/11/usr.bin/grep/file.c Mon Nov 11 19:06:04 2019 (r354627)
+++ stable/11/usr.bin/grep/file.c Mon Nov 11 19:54:08 2019 (r354628)
@@ -86,6 +86,9 @@ static inline int
grep_refill(struct file *f)
{
ssize_t nr;
+#ifndef WITHOUT_LZMA
+ lzma_ret lzmaret;
+#endif
if (filebehave == FILE_MMAP)
return (0);
@@ -93,41 +96,52 @@ grep_refill(struct file *f)
bufpos = buffer;
bufrem = 0;
- if (filebehave == FILE_GZIP) {
+ switch (filebehave) {
+ case FILE_GZIP:
nr = gzread(gzbufdesc, buffer, MAXBUFSIZ);
+ break;
#ifndef WITHOUT_BZIP2
- } else if (filebehave == FILE_BZIP && bzbufdesc != NULL) {
- int bzerr;
+ case FILE_BZIP:
+ if (bzbufdesc != NULL) {
+ int bzerr;
- nr = BZ2_bzRead(&bzerr, bzbufdesc, buffer, MAXBUFSIZ);
- switch (bzerr) {
- case BZ_OK:
- case BZ_STREAM_END:
- /* No problem, nr will be okay */
- break;
- case BZ_DATA_ERROR_MAGIC:
+ nr = BZ2_bzRead(&bzerr, bzbufdesc, buffer, MAXBUFSIZ);
+ switch (bzerr) {
+ case BZ_OK:
+ case BZ_STREAM_END:
+ /* No problem, nr will be okay */
+ break;
+ case BZ_DATA_ERROR_MAGIC:
+ /*
+ * As opposed to gzread(), which simply returns the
+ * plain file data, if it is not in the correct
+ * compressed format, BZ2_bzRead() instead aborts.
+ *
+ * So, just restart at the beginning of the file again,
+ * and use plain reads from now on.
+ */
+ BZ2_bzReadClose(&bzerr, bzbufdesc);
+ bzbufdesc = NULL;
+ if (lseek(f->fd, 0, SEEK_SET) == -1)
+ return (-1);
+ nr = read(f->fd, buffer, MAXBUFSIZ);
+ break;
+ default:
+ /* Make sure we exit with an error */
+ nr = -1;
+ }
+ } else
/*
- * As opposed to gzread(), which simply returns the
- * plain file data, if it is not in the correct
- * compressed format, BZ2_bzRead() instead aborts.
- *
- * So, just restart at the beginning of the file again,
- * and use plain reads from now on.
+ * Also an error case; we should never have a scenario
+ * where we have an open file but no bzip descriptor
+ * at this point. See: grep_open
*/
- BZ2_bzReadClose(&bzerr, bzbufdesc);
- bzbufdesc = NULL;
- if (lseek(f->fd, 0, SEEK_SET) == -1)
- return (-1);
- nr = read(f->fd, buffer, MAXBUFSIZ);
- break;
- default:
- /* Make sure we exit with an error */
nr = -1;
- }
+ break;
#endif
#ifndef WITHOUT_LZMA
- } else if ((filebehave == FILE_XZ) || (filebehave == FILE_LZMA)) {
- lzma_ret ret;
+ case FILE_XZ:
+ case FILE_LZMA:
lstrm.next_out = buffer;
do {
@@ -143,23 +157,23 @@ grep_refill(struct file *f)
lstrm.avail_in = nr;
}
- ret = lzma_code(&lstrm, laction);
+ lzmaret = lzma_code(&lstrm, laction);
- if (ret != LZMA_OK && ret != LZMA_STREAM_END)
+ if (lzmaret != LZMA_OK && lzmaret != LZMA_STREAM_END)
return (-1);
- if (lstrm.avail_out == 0 || ret == LZMA_STREAM_END) {
+ if (lstrm.avail_out == 0 || lzmaret == LZMA_STREAM_END) {
bufrem = MAXBUFSIZ - lstrm.avail_out;
lstrm.next_out = buffer;
lstrm.avail_out = MAXBUFSIZ;
}
- } while (bufrem == 0 && ret != LZMA_STREAM_END);
+ } while (bufrem == 0 && lzmaret != LZMA_STREAM_END);
return (0);
-#endif /* WIHTOUT_LZMA */
- } else
+#endif /* WITHOUT_LZMA */
+ default:
nr = read(f->fd, buffer, MAXBUFSIZ);
-
+ }
if (nr < 0)
return (-1);
@@ -180,7 +194,7 @@ grep_lnbufgrow(size_t newlen)
}
char *
-grep_fgetln(struct file *f, size_t *lenp)
+grep_fgetln(struct file *f, struct parsec *pc)
{
unsigned char *p;
char *ret;
@@ -194,7 +208,7 @@ grep_fgetln(struct file *f, size_t *lenp)
if (bufrem == 0) {
/* Return zero length to indicate EOF */
- *lenp = 0;
+ pc->ln.len= 0;
return (bufpos);
}
@@ -205,7 +219,7 @@ grep_fgetln(struct file *f, size_t *lenp)
len = p - bufpos;
bufrem -= len;
bufpos = p;
- *lenp = len;
+ pc->ln.len = len;
return (ret);
}
@@ -240,11 +254,11 @@ grep_fgetln(struct file *f, size_t *lenp)
bufpos = p;
break;
}
- *lenp = len;
+ pc->ln.len = len;
return (lnbuf);
error:
- *lenp = 0;
+ pc->ln.len = 0;
return (NULL);
}
@@ -255,6 +269,9 @@ struct file *
grep_open(const char *path)
{
struct file *f;
+#ifndef WITHOUT_LZMA
+ lzma_ret lzmaret;
+#endif
f = grep_malloc(sizeof *f);
memset(f, 0, sizeof *f);
@@ -292,32 +309,36 @@ grep_open(const char *path)
if ((buffer == NULL) || (buffer == MAP_FAILED))
buffer = grep_malloc(MAXBUFSIZ);
- if (filebehave == FILE_GZIP &&
- (gzbufdesc = gzdopen(f->fd, "r")) == NULL)
- goto error2;
-
+ switch (filebehave) {
+ case FILE_GZIP:
+ if ((gzbufdesc = gzdopen(f->fd, "r")) == NULL)
+ goto error2;
+ break;
#ifndef WITHOUT_BZIP2
- if (filebehave == FILE_BZIP &&
- (bzbufdesc = BZ2_bzdopen(f->fd, "r")) == NULL)
- goto error2;
+ case FILE_BZIP:
+ if ((bzbufdesc = BZ2_bzdopen(f->fd, "r")) == NULL)
+ goto error2;
+ break;
#endif
#ifndef WITHOUT_LZMA
- else if ((filebehave == FILE_XZ) || (filebehave == FILE_LZMA)) {
- lzma_ret ret;
+ case FILE_XZ:
+ case FILE_LZMA:
- ret = (filebehave == FILE_XZ) ?
- lzma_stream_decoder(&lstrm, UINT64_MAX,
- LZMA_CONCATENATED) :
- lzma_alone_decoder(&lstrm, UINT64_MAX);
+ if (filebehave == FILE_XZ)
+ lzmaret = lzma_stream_decoder(&lstrm, UINT64_MAX,
+ LZMA_CONCATENATED);
+ else
+ lzmaret = lzma_alone_decoder(&lstrm, UINT64_MAX);
- if (ret != LZMA_OK)
+ if (lzmaret != LZMA_OK)
goto error2;
lstrm.avail_in = 0;
lstrm.avail_out = MAXBUFSIZ;
laction = LZMA_RUN;
- }
+ break;
#endif
+ }
/* Fill read buffer, also catches errors early */
if (bufrem == 0 && grep_refill(f) != 0)
@@ -326,7 +347,7 @@ grep_open(const char *path)
/* Check for binary stuff, if necessary */
if (binbehave != BINFILE_TEXT && fileeol != '\0' &&
memchr(bufpos, '\0', bufrem) != NULL)
- f->binary = true;
+ f->binary = true;
return (f);
Modified: stable/11/usr.bin/grep/grep.1
==============================================================================
--- stable/11/usr.bin/grep/grep.1 Mon Nov 11 19:06:04 2019 (r354627)
+++ stable/11/usr.bin/grep/grep.1 Mon Nov 11 19:54:08 2019 (r354628)
@@ -30,7 +30,7 @@
.\"
.\" @(#)grep.1 8.3 (Berkeley) 4/18/94
.\"
-.Dd April 17, 2017
+.Dd May 7, 2018
.Dt GREP 1
.Os
.Sh NAME
@@ -410,6 +410,13 @@ and block buffered otherwise.
.El
.Pp
If no file arguments are specified, the standard input is used.
+Additionally,
+.Dq -
+may be used in place of a file name, anywhere that a file name is accepted, to
+read from standard input.
+This includes both
+.Fl f
+and file arguments.
.Sh EXIT STATUS
The
.Nm grep
Modified: stable/11/usr.bin/grep/grep.c
==============================================================================
--- stable/11/usr.bin/grep/grep.c Mon Nov 11 19:06:04 2019 (r354627)
+++ stable/11/usr.bin/grep/grep.c Mon Nov 11 19:54:08 2019 (r354628)
@@ -239,20 +239,9 @@ static void
add_pattern(char *pat, size_t len)
{
- /* Do not add further pattern is we already match everything */
- if (matchall)
- return;
-
/* Check if we can do a shortcut */
if (len == 0) {
matchall = true;
- for (unsigned int i = 0; i < patterns; i++) {
- free(pattern[i].pat);
- }
- pattern = grep_realloc(pattern, sizeof(struct pat));
- pattern[0].pat = NULL;
- pattern[0].len = 0;
- patterns = 1;
return;
}
/* Increase size if necessary */
@@ -319,7 +308,9 @@ read_patterns(const char *fn)
size_t len;
ssize_t rlen;
- if ((f = fopen(fn, "r")) == NULL)
+ if (strcmp(fn, "-") == 0)
+ f = stdin;
+ else if ((f = fopen(fn, "r")) == NULL)
err(2, "%s", fn);
if ((fstat(fileno(f), &st) == -1) || (S_ISDIR(st.st_mode))) {
fclose(f);
@@ -336,7 +327,8 @@ read_patterns(const char *fn)
free(line);
if (ferror(f))
err(2, "%s", fn);
- fclose(f);
+ if (strcmp(fn, "-") != 0)
+ fclose(f);
}
static inline const char *
@@ -357,6 +349,7 @@ main(int argc, char *argv[])
long long l;
unsigned int aargc, eargc, i;
int c, lastc, needpattern, newarg, prevoptind;
+ bool matched;
setlocale(LC_ALL, "");
@@ -701,7 +694,7 @@ main(int argc, char *argv[])
aargv += optind;
/* Empty pattern file matches nothing */
- if (!needpattern && (patterns == 0))
+ if (!needpattern && (patterns == 0) && !matchall)
exit(1);
/* Fail if we don't have any pattern */
@@ -751,11 +744,10 @@ main(int argc, char *argv[])
#endif
r_pattern = grep_calloc(patterns, sizeof(*r_pattern));
- /* Don't process any patterns if we have a blank one */
#ifdef WITH_INTERNAL_NOSPEC
- if (!matchall && grepbehave != GREP_FIXED) {
+ if (grepbehave != GREP_FIXED) {
#else
- if (!matchall) {
+ {
#endif
/* Check if cheating is allowed (always is for fgrep). */
for (i = 0; i < patterns; ++i) {
@@ -787,12 +779,13 @@ main(int argc, char *argv[])
exit(!procfile("-"));
if (dirbehave == DIR_RECURSE)
- c = grep_tree(aargv);
+ matched = grep_tree(aargv);
else
- for (c = 0; aargc--; ++aargv) {
+ for (matched = false; aargc--; ++aargv) {
if ((finclude || fexclude) && !file_matching(*aargv))
continue;
- c+= procfile(*aargv);
+ if (procfile(*aargv))
+ matched = true;
}
#ifndef WITHOUT_NLS
@@ -801,5 +794,8 @@ main(int argc, char *argv[])
/* Find out the correct return value according to the
results and the command line option. */
- exit(c ? (file_err ? (qflag ? 0 : 2) : 0) : (file_err ? 2 : 1));
+ if (Lflag)
+ matched = !matched;
+
+ exit(matched ? (file_err ? (qflag ? 0 : 2) : 0) : (file_err ? 2 : 1));
}
Modified: stable/11/usr.bin/grep/grep.h
==============================================================================
--- stable/11/usr.bin/grep/grep.h Mon Nov 11 19:06:04 2019 (r354627)
+++ stable/11/usr.bin/grep/grep.h Mon Nov 11 19:54:08 2019 (r354628)
@@ -114,6 +114,21 @@ struct epat {
int mode;
};
+/*
+ * Parsing context; used to hold things like matches made and
+ * other useful bits
+ */
+struct parsec {
+ regmatch_t matches[MAX_MATCHES]; /* Matches made */
+ /* XXX TODO: This should be a chunk, not a line */
+ struct str ln; /* Current line */
+ size_t lnstart; /* Position in line */
+ size_t matchidx; /* Latest match index */
+ int printed; /* Metadata printed? */
+ bool binary; /* Binary file? */
+ bool cntlines; /* Count lines? */
+};
+
/* Flags passed to regcomp() and regexec() */
extern int cflags, eflags;
@@ -145,8 +160,8 @@ extern char re_error[RE_ERROR_BUF + 1]; /* Seems big
/* util.c */
bool file_matching(const char *fname);
-int procfile(const char *fn);
-int grep_tree(char **argv);
+bool procfile(const char *fn);
+bool grep_tree(char **argv);
void *grep_malloc(size_t size);
void *grep_calloc(size_t nmemb, size_t size);
void *grep_realloc(void *ptr, size_t size);
@@ -161,4 +176,4 @@ void clearqueue(void);
/* file.c */
void grep_close(struct file *f);
struct file *grep_open(const char *path);
-char *grep_fgetln(struct file *f, size_t *len);
+char *grep_fgetln(struct file *f, struct parsec *pc);
Modified: stable/11/usr.bin/grep/tests/grep_freebsd_test.sh
==============================================================================
--- stable/11/usr.bin/grep/tests/grep_freebsd_test.sh Mon Nov 11 19:06:04 2019 (r354627)
+++ stable/11/usr.bin/grep/tests/grep_freebsd_test.sh Mon Nov 11 19:54:08 2019 (r354628)
@@ -81,8 +81,34 @@ rgrep_body()
atf_check -o file:d_grep_r_implied.out rgrep --exclude="*.out" -e "test" "$(atf_get_srcdir)"
}
+atf_test_case gnuext
+gnuext_body()
+{
+ grep_type
+ _type=$?
+ if [ $_type -eq $GREP_TYPE_BSD ]; then
+ atf_expect_fail "this test requires GNU extensions in regex(3)"
+ elif [ $_type -eq $GREP_TYPE_GNU_FREEBSD ]; then
+ atf_expect_fail "\\s and \\S are known to be buggy in base gnugrep"
+ fi
+
+ atf_check -o save:grep_alnum.out grep -o '[[:alnum:]]' /COPYRIGHT
+ atf_check -o file:grep_alnum.out grep -o '\w' /COPYRIGHT
+
+ atf_check -o save:grep_nalnum.out grep -o '[^[:alnum:]]' /COPYRIGHT
+ atf_check -o file:grep_nalnum.out grep -o '\W' /COPYRIGHT
+
+ atf_check -o save:grep_space.out grep -o '[[:space:]]' /COPYRIGHT
+ atf_check -o file:grep_space.out grep -o '\s' /COPYRIGHT
+
+ atf_check -o save:grep_nspace.out grep -o '[^[:space:]]' /COPYRIGHT
+ atf_check -o file:grep_nspace.out grep -o '\S' /COPYRIGHT
+
+}
+
atf_init_test_cases()
{
atf_add_test_case grep_r_implied
atf_add_test_case rgrep
+ atf_add_test_case gnuext
}
Modified: stable/11/usr.bin/grep/util.c
==============================================================================
--- stable/11/usr.bin/grep/util.c Mon Nov 11 19:06:04 2019 (r354627)
+++ stable/11/usr.bin/grep/util.c Mon Nov 11 19:54:08 2019 (r354628)
@@ -60,23 +60,24 @@ __FBSDID("$FreeBSD$");
static bool first_match = true;
/*
- * Parsing context; used to hold things like matches made and
- * other useful bits
+ * Match printing context
*/
-struct parsec {
- regmatch_t matches[MAX_MATCHES]; /* Matches made */
- struct str ln; /* Current line */
- size_t lnstart; /* Position in line */
- size_t matchidx; /* Latest match index */
- int printed; /* Metadata printed? */
- bool binary; /* Binary file? */
+struct mprintc {
+ long long tail; /* Number of trailing lines to record */
+ int last_outed; /* Number of lines since last output */
+ bool doctx; /* Printing context? */
+ bool printmatch; /* Printing matches? */
+ bool same_file; /* Same file as previously printed? */
};
+static void procmatch_match(struct mprintc *mc, struct parsec *pc);
+static void procmatch_nomatch(struct mprintc *mc, struct parsec *pc);
+static bool procmatches(struct mprintc *mc, struct parsec *pc, bool matched);
#ifdef WITH_INTERNAL_NOSPEC
static int litexec(const struct pat *pat, const char *string,
size_t nmatch, regmatch_t pmatch[]);
#endif
-static int procline(struct parsec *pc);
+static bool procline(struct parsec *pc);
static void printline(struct parsec *pc, int sep);
static void printline_metadata(struct str *line, int sep);
@@ -94,13 +95,12 @@ file_matching(const char *fname)
for (unsigned int i = 0; i < fpatterns; ++i) {
if (fnmatch(fpattern[i].pat, fname, 0) == 0 ||
- fnmatch(fpattern[i].pat, fname_base, 0) == 0) {
- if (fpattern[i].mode == EXCL_PAT) {
- ret = false;
- break;
- } else
- ret = true;
- }
+ fnmatch(fpattern[i].pat, fname_base, 0) == 0)
+ /*
+ * The last pattern matched wins exclusion/inclusion
+ * rights, so we can't reasonably bail out early here.
+ */
+ ret = (fpattern[i].mode != EXCL_PAT);
}
free(fname_buf);
return (ret);
@@ -114,13 +114,12 @@ dir_matching(const char *dname)
ret = dinclude ? false : true;
for (unsigned int i = 0; i < dpatterns; ++i) {
- if (dname != NULL &&
- fnmatch(dpattern[i].pat, dname, 0) == 0) {
- if (dpattern[i].mode == EXCL_PAT)
- return (false);
- else
- ret = true;
- }
+ if (dname != NULL && fnmatch(dpattern[i].pat, dname, 0) == 0)
+ /*
+ * The last pattern matched wins exclusion/inclusion
+ * rights, so we can't reasonably bail out early here.
+ */
+ ret = (dpattern[i].mode != EXCL_PAT);
}
return (ret);
}
@@ -129,17 +128,18 @@ dir_matching(const char *dname)
* Processes a directory when a recursive search is performed with
* the -R option. Each appropriate file is passed to procfile().
*/
-int
+bool
grep_tree(char **argv)
{
FTS *fts;
FTSENT *p;
- int c, fts_flags;
- bool ok;
+ int fts_flags;
+ bool matched, ok;
const char *wd[] = { ".", NULL };
- c = fts_flags = 0;
+ matched = false;
+ /* This switch effectively initializes 'fts_flags' */
switch(linkbehave) {
case LINK_EXPLICIT:
fts_flags = FTS_COMFOLLOW;
@@ -149,7 +149,6 @@ grep_tree(char **argv)
break;
default:
fts_flags = FTS_LOGICAL;
-
}
fts_flags |= FTS_NOSTAT | FTS_NOCHDIR;
@@ -178,7 +177,7 @@ grep_tree(char **argv)
case FTS_DC:
/* Print a warning for recursive directory loop */
warnx("warning: %s: recursive directory loop",
- p->fts_path);
+ p->fts_path);
break;
default:
/* Check for file exclusion/inclusion */
@@ -186,44 +185,122 @@ grep_tree(char **argv)
if (fexclude || finclude)
ok &= file_matching(p->fts_path);
- if (ok)
- c += procfile(p->fts_path);
+ if (ok && procfile(p->fts_path))
+ matched = true;
break;
}
}
fts_close(fts);
- return (c);
+ return (matched);
}
+static void
+procmatch_match(struct mprintc *mc, struct parsec *pc)
+{
+
+ if (mc->doctx) {
+ if (!first_match && (!mc->same_file || mc->last_outed > 0))
+ printf("--\n");
+ if (Bflag > 0)
+ printqueue();
+ mc->tail = Aflag;
+ }
+
+ /* Print the matching line, but only if not quiet/binary */
+ if (mc->printmatch) {
+ printline(pc, ':');
+ while (pc->matchidx >= MAX_MATCHES) {
+ /* Reset matchidx and try again */
+ pc->matchidx = 0;
+ if (procline(pc) == !vflag)
+ printline(pc, ':');
+ else
+ break;
+ }
+ first_match = false;
+ mc->same_file = true;
+ mc->last_outed = 0;
+ }
+}
+
+static void
+procmatch_nomatch(struct mprintc *mc, struct parsec *pc)
+{
+
+ /* Deal with any -A context as needed */
+ if (mc->tail > 0) {
+ grep_printline(&pc->ln, '-');
+ mc->tail--;
+ if (Bflag > 0)
+ clearqueue();
+ } else if (Bflag == 0 || (Bflag > 0 && enqueue(&pc->ln)))
+ /*
+ * Enqueue non-matching lines for -B context. If we're not
+ * actually doing -B context or if the enqueue resulted in a
+ * line being rotated out, then go ahead and increment
+ * last_outed to signify a gap between context/match.
+ */
+ ++mc->last_outed;
+}
+
/*
+ * Process any matches in the current parsing context, return a boolean
+ * indicating whether we should halt any further processing or not. 'true' to
+ * continue processing, 'false' to halt.
+ */
+static bool
+procmatches(struct mprintc *mc, struct parsec *pc, bool matched)
+{
+
+ /*
+ * XXX TODO: This should loop over pc->matches and handle things on a
+ * line-by-line basis, setting up a `struct str` as needed.
+ */
+ /* Deal with any -B context or context separators */
+ if (matched) {
+ procmatch_match(mc, pc);
+
+ /* Count the matches if we have a match limit */
+ if (mflag) {
+ /* XXX TODO: Decrement by number of matched lines */
+ mcount -= 1;
+ if (mcount <= 0)
+ return (false);
+ }
+ } else if (mc->doctx)
+ procmatch_nomatch(mc, pc);
+
+ return (true);
+}
+
+/*
* Opens a file and processes it. Each file is processed line-by-line
* passing the lines to procline().
*/
-int
+bool
procfile(const char *fn)
{
struct parsec pc;
- long long tail;
+ struct mprintc mc;
struct file *f;
struct stat sb;
- struct str *ln;
mode_t s;
- int c, last_outed, t;
- bool doctx, printmatch, same_file;
+ int lines;
+ bool line_matched;
if (strcmp(fn, "-") == 0) {
fn = label != NULL ? label : getstr(1);
f = grep_open(NULL);
} else {
- if (!stat(fn, &sb)) {
+ if (stat(fn, &sb) == 0) {
/* Check if we need to process the file */
s = sb.st_mode & S_IFMT;
- if (s == S_IFDIR && dirbehave == DIR_SKIP)
- return (0);
- if ((s == S_IFIFO || s == S_IFCHR || s == S_IFBLK
- || s == S_IFSOCK) && devbehave == DEV_SKIP)
- return (0);
+ if (dirbehave == DIR_SKIP && s == S_IFDIR)
+ return (false);
+ if (devbehave == DEV_SKIP && (s == S_IFIFO ||
+ s == S_IFCHR || s == S_IFBLK || s == S_IFSOCK))
+ return (false);
}
f = grep_open(fn);
}
@@ -231,39 +308,41 @@ procfile(const char *fn)
file_err = true;
if (!sflag)
warn("%s", fn);
- return (0);
+ return (false);
}
- /* Convenience */
- ln = &pc.ln;
- pc.ln.file = grep_malloc(strlen(fn) + 1);
- strcpy(pc.ln.file, fn);
+ pc.ln.file = grep_strdup(fn);
pc.ln.line_no = 0;
pc.ln.len = 0;
pc.ln.boff = 0;
pc.ln.off = -1;
pc.binary = f->binary;
- pc.printed = 0;
- tail = 0;
- last_outed = 0;
- same_file = false;
- doctx = false;
- printmatch = true;
+ pc.cntlines = false;
+ memset(&mc, 0, sizeof(mc));
+ mc.printmatch = true;
if ((pc.binary && binbehave == BINFILE_BIN) || cflag || qflag ||
lflag || Lflag)
- printmatch = false;
- if (printmatch && (Aflag != 0 || Bflag != 0))
- doctx = true;
+ mc.printmatch = false;
+ if (mc.printmatch && (Aflag != 0 || Bflag != 0))
+ mc.doctx = true;
+ if (mc.printmatch && (Aflag != 0 || Bflag != 0 || mflag || nflag))
+ pc.cntlines = true;
mcount = mlimit;
- for (c = 0; c == 0 || !(lflag || qflag); ) {
+ for (lines = 0; lines == 0 || !(lflag || qflag); ) {
+ /*
+ * XXX TODO: We need to revisit this in a chunking world. We're
+ * not going to be doing per-line statistics because of the
+ * overhead involved. procmatches can figure that stuff out as
+ * needed. */
/* Reset per-line statistics */
pc.printed = 0;
pc.matchidx = 0;
pc.lnstart = 0;
pc.ln.boff = 0;
pc.ln.off += pc.ln.len + 1;
- if ((pc.ln.dat = grep_fgetln(f, &pc.ln.len)) == NULL ||
+ /* XXX TODO: Grab a chunk */
+ if ((pc.ln.dat = grep_fgetln(f, &pc)) == NULL ||
pc.ln.len == 0)
break;
@@ -279,59 +358,13 @@ procfile(const char *fn)
return (0);
}
- if ((t = procline(&pc)) == 0)
- ++c;
+ line_matched = procline(&pc) == !vflag;
+ if (line_matched)
+ ++lines;
- /* Deal with any -B context or context separators */
- if (t == 0 && doctx) {
- if (!first_match && (!same_file || last_outed > 0))
- printf("--\n");
- if (Bflag > 0)
- printqueue();
- tail = Aflag;
- }
- /* Print the matching line, but only if not quiet/binary */
- if (t == 0 && printmatch) {
- printline(&pc, ':');
- while (pc.matchidx >= MAX_MATCHES) {
- /* Reset matchidx and try again */
- pc.matchidx = 0;
- if (procline(&pc) == 0)
- printline(&pc, ':');
- else
- break;
- }
- first_match = false;
- same_file = true;
- last_outed = 0;
- }
- if (t != 0 && doctx) {
- /* Deal with any -A context */
- if (tail > 0) {
- grep_printline(&pc.ln, '-');
- tail--;
- if (Bflag > 0)
- clearqueue();
- } else {
- /*
- * Enqueue non-matching lines for -B context.
- * If we're not actually doing -B context or if
- * the enqueue resulted in a line being rotated
- * out, then go ahead and increment last_outed
- * to signify a gap between context/match.
- */
- if (Bflag == 0 || (Bflag > 0 && enqueue(ln)))
- ++last_outed;
- }
- }
-
- /* Count the matches if we have a match limit */
- if (t == 0 && mflag) {
- --mcount;
- if (mflag && mcount <= 0)
- break;
- }
-
+ /* Halt processing if we hit our match limit */
+ if (!procmatches(&mc, &pc, line_matched))
+ break;
}
if (Bflag > 0)
clearqueue();
@@ -340,19 +373,19 @@ procfile(const char *fn)
if (cflag) {
if (!hflag)
printf("%s:", pc.ln.file);
- printf("%u\n", c);
+ printf("%u\n", lines);
}
*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***