Issue with grep -i (on i386 only?)
Mel Flynn
mel.flynn+fbsd.hackers at mailing.thruhere.net
Tue Nov 3 20:22:33 UTC 2009
Hi,
Attached is a little test script for grep's -i performance. I tried a few
different machines; the 64-bit 7.2 machine I could steal doesn't seem to be
affected and outperforms pcregrep.
On i386 machines, grep -i is significantly slower, by roughly two orders of
magnitude in the runs below:
i386, 7.2-STABLE of Sep 8, load averages: 0.00, 0.02, 0.00,
Mem: 336M Active, 442M Inact, 217M Wired, 38M Cache, 112M Buf, 198M Free
dev.cpu.0.freq: 2992 (Intel P-IV HTT enabled)
16Meg file result:
=>>> 16777216
=>>> fgrep
0.04 real 0.02 user 0.01 sys
0.04 real 0.03 user 0.01 sys
=>>> pcregrep
0.21 real 0.19 user 0.02 sys
0.21 real 0.20 user 0.00 sys
=>>> grep
0.04 real 0.02 user 0.01 sys << not -i
3.64 real 3.61 user 0.01 sys << -i
i386, 8.0-RC1 FreeBSD 8.0-RC1 #15 r197337M, load averages: 1.61, 1.35, 1.12
Mem: 920M Active, 87M Inact, 215M Wired, 69M Cache, 112M Buf, 195M Free
dev.cpu.0.freq: 1733 (Intel dual core laptop)
16Meg file result:
=>>> 16777216
=>>> fgrep
0.04 real 0.02 user 0.01 sys
0.05 real 0.04 user 0.00 sys
=>>> pcregrep
0.26 real 0.23 user 0.01 sys
0.29 real 0.24 user 0.00 sys
=>>> grep
0.04 real 0.04 user 0.00 sys
4.73 real 4.15 user 0.01 sys
amd64, 7.2-RELEASE-p4 #1 r198384M, load averages: 0.00, 0.00, 0.00
Mem: 115M Active, 182M Inact, 264M Wired, 101M Cache, 213M Buf, 1311M Free
CPU: Dual-Core AMD Opteron(tm) Processor 2210 (1800.08-MHz K8-class CPU)
64Meg file result:
=>>> 67108864
=>>> fgrep
0.18 real 0.13 user 0.04 sys
0.19 real 0.17 user 0.02 sys
=>>> pcregrep
0.89 real 0.85 user 0.03 sys
0.98 real 0.92 user 0.06 sys
=>>> grep
0.18 real 0.16 user 0.01 sys
0.19 real 0.16 user 0.03 sys
So on the laptop I modified the test script to the version attached now.
While there is still a significant delay, the wall-clock time is less than
half when the expression is rewritten with the same meaning:
=>>> 16777216
=>>> fgrep
0.04 real 0.03 user 0.00 sys
0.05 real 0.03 user 0.01 sys
0.02 real 0.00 user 0.00 sys
=>>> pcregrep
0.26 real 0.21 user 0.02 sys
0.26 real 0.22 user 0.02 sys
0.44 real 0.35 user 0.01 sys
=>>> grep
0.04 real 0.04 user 0.00 sys
4.45 real 4.15 user 0.01 sys
2.00 real 1.81 user 0.00 sys <-- [fF][Oo][Oo]
So it looks to me that, while there is a problem with case-insensitive
comparison, just rewriting the expression is an optimization grep could
perform.
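As a workaround until grep does this itself, the rewrite can be done by hand
before calling grep. The following is only a minimal sketch: ci_bracket is a
hypothetical helper name, not part of grep or the attached script, and it
assumes the pattern is a plain word. Patterns that already contain bracket
expressions or backslash escapes would be mangled.

```shell
#!/bin/sh
# Hypothetical helper (not part of grep): expand each letter of a plain
# word into a two-character bracket expression, e.g. foo -> [fF][oO][oO].
# Assumption: the input contains no bracket expressions or backslash
# escapes; non-letters are passed through unchanged.
ci_bracket() {
	printf '%s\n' "$1" | awk '{
		out = ""
		for (i = 1; i <= length($0); i++) {
			c = substr($0, i, 1)
			l = tolower(c)
			u = toupper(c)
			if (l != u)
				out = out "[" l u "]"	# letter: match either case
			else
				out = out c		# anything else: keep as-is
		}
		print out
	}'
}
```

Usage would then be along the lines of: grep "$(ci_bracket foo)" somefile
in place of: grep -i foo somefile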
Either way, with the new text tools being written (done?), is this problem
being attacked, not fixable due to specifications, or not considered an issue?
Any PRs needed, or existing ones I missed? Patches to try?
[And it just occurred to me that bsdgrep is in ports]:
=>>> bsdgrep
0.93 real 0.74 user 0.00 sys
4.80 real 4.33 user 0.02 sys
4.97 real 4.34 user 0.01 sys
So here the optimization does not fly.
--
Mel
-------------- next part --------------
#!/bin/sh
# vim: ts=4 sw=4 noet tw=78 ai
PCREGREP=`which pcregrep`
BSDGREP=`which bsdgrep`
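# Quote the variables: an unquoted empty value makes "[ -n ]" succeed,
# which would then call basename with no argument.
[ -n "${PCREGREP}" ] && PCREGREP=`basename "${PCREGREP}"`
[ -n "${BSDGREP}" ] && BSDGREP=`basename "${BSDGREP}"`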
me=`basename $0`
BYTES="1048576 2097152 4194304 8388608 16777216"
if [ ! -x /usr/bin/jot ]; then
	echo "Need jot"
	exit 1
fi
if [ ! -x /usr/bin/rs ]; then
	echo "Need rs"
	exit 1
fi
for b in ${BYTES}; do
	TMPFILE=`mktemp -t ${me}`
	if [ ! -f ${TMPFILE} ]; then
		echo Can\'t create tmp files in ${TMPDIR:="/tmp"}
		exit 2
	fi
	jot -r -c ${b} a z |rs -g 0 20 > ${TMPFILE}
	echo "=>>> ${b}"
	for prog in fgrep ${PCREGREP} ${BSDGREP} grep ; do
		echo " =>>> ${prog}"
		/usr/bin/time ${prog} foo ${TMPFILE} >/dev/null
		/usr/bin/time ${prog} -i foo ${TMPFILE} >/dev/null
		/usr/bin/time ${prog} '[fF][Oo][Oo]' ${TMPFILE} >/dev/null
	done
	rm ${TMPFILE}
done