speed of bzip2 versus gzip
Norberto Meijome
freebsd at meijome.net
Sat Jul 21 02:36:04 UTC 2007
On Fri, 20 Jul 2007 18:24:55 -0700
James Long <list at museum.rain.com> wrote:
> On Fri, Jul 20, 2007 at 05:50:20PM -0700, Chuck Swiger wrote:
> > On Jul 20, 2007, at 5:37 PM, Norberto Meijome wrote:
> >>> Is it normal for bzip2 to be significantly slower than gzip?
> >>> If not, where can I look for things that might be causing
> >>> "bzip2 --fast" to take 50-60 times longer to compress a
> >>> (sendmail log) file than gzip?
> >>
> >> i never measured it to see if it is 50-60 times slower, but yes, gzip
> >> blows bzip2 out of the water on speed. I wanted to use bzip2 to compress
> >> multi-GB weblog files, but gzip beat it by miles, and bzip2 wasn't THAT
> >> much better at compressing it to make it worth it.
> >
> > Thanks for the feedback, Norberto.
> >
> > Of course, it all depends on what your priorities are, too-- if what you
> > want is a final tarball which is being mirrored and downloaded frequently,
> > then your goal is to obtain the absolute best compression, and how much CPU
> > --best takes isn't important.
> >
> > Comparing the default (-5 compression?) of gzip to bzip2 would probably be
> > more reasonable if you care about reasonably timely compression.
>
> If I read the man page correctly, bzip2 defaults to --best, which is why
> I compared gzip to bzip2 --fast. With the 1.5G sendmail log, bzip2 --fast
> compresses to just under 10M in about 55 minutes, give or take. bzip2
> --best compresses 1.5G to 1.8M, but takes about 2.25 hours. gzip
> compresses almost as well (with 3% or so) as --fast, but does it in 1
> minute instead of 55 on a dual P-III 1.4GHz (but of course, using only
> one CPU).
I don't have the exact numbers at hand, but yes, they were definitely in that crazy range.
BTW, I always compared default bzip2 against gzip -9, because I was interested in making gzip work harder to achieve some more compression.
I ran some short tests. Neither system is doing much beyond this simple test.
Comparison using a 249 MB Apache web log file
The first is my laptop running FreeBSD, single CPU.
The second is a server with the same hardware I used to compress those multi-GB log files in 2005. This one is running 64-bit CentOS. I know, not FreeBSD, but it lets us see whether the OS makes a difference.
Both boxes have enough RAM to hold the whole file in memory.
The numbers are quite similar, even given the difference in hardware...it may speak very well of FreeBSD speeds ;)
Compression ratios are the same on both Linux and FreeBSD, and bzip2 compresses >THIS FILE< to about half the size that gzip -9 does (5.2M versus 11M).
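As a rough cross-check of those ratios, the arithmetic can be sketched with the sizes and times from the listings further down (249 MB input, 11M for gzip -9, 5.2M for bzip2, and the "real" compression times on the FreeBSD laptop). The ratio helper is a hypothetical name used only for this illustration, and since ls -lh rounds the sizes, the results are approximate:

```shell
#!/bin/sh
# ratio A B prints A divided by B with one decimal place.
# (ratio is a hypothetical helper for this illustration.)
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Sizes in MB as reported: 249 original, 11 gzip -9, 5.2 bzip2.
ratio 249 11     # gzip -9: original is about 22.6x the compressed size
ratio 249 5.2    # bzip2:   original is about 47.9x the compressed size
ratio 11 5.2     # gzip -9 output is about 2.1x the bzip2 output

# Compression wall-clock ("real") seconds on the FreeBSD laptop.
ratio 242.662 13.373   # bzip2 took about 18.1x as long as gzip -9
```

So on this file bzip2 roughly halves the gzip -9 output, at the cost of about 18 times the compression time on the laptop.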
------------------
Box 1
CPU: Intel(R) Pentium(R) M processor 2.00GHz (1995.02-MHz 686-class CPU)
1.5 GB RAM
$ uname -a
FreeBSD ayiin.octantis.com.au 6.2-STABLE FreeBSD 6.2-STABLE #12: Fri Jul 13 17:45:09 EST 2007 root@ayiin.octantis.com.au:/usr/obj/usr/src/sys/AYIIN i386
$ time gzip -9 20070604-desktop.log
real 0m13.373s
user 0m10.398s
sys 0m0.257s
[betom@ayiin] [Sat Jul 21 12:27:14 2007]
/usr/home/betom/Desktop
$ ls -lh 20070604-desktop.log.gz
-rw-r--r-- 1 betom betom 11M Jul 21 12:17 20070604-desktop.log.gz
$ time gunzip ./20070604-desktop.log.gz
real 0m13.926s
user 0m1.455s
sys 0m0.525s
$ time bzip2 20070604-desktop.log
real 4m2.662s
user 3m21.184s
sys 0m0.321s
$ ls -lh 20070604-desktop.log.bz2
-rw-r--r-- 1 betom betom 5.2M Jul 21 12:17 20070604-desktop.log.bz2
$ time bunzip2 20070604-desktop.log.bz2
real 0m18.650s
user 0m13.922s
sys 0m0.794s
==================================================
Box 2
# uname -a
Linux cerberus.octantis.com.au. 2.6.18-8.1.4.el5.centos.plus #1 SMP Sun May 20 10:53:21 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
CPU: 2 x AMD Opteron(tm) Processor 250 (stepping 10, 2400.000 MHz)
4 GB RAM
[root@cerberus] [Sat 21 Jul 2007 12:22:39 PM EST]
~
# time gzip -9 20070604-desktop.log
real 0m7.818s
user 0m7.343s
sys 0m0.332s
[root@cerberus] [Sat 21 Jul 2007 12:22:56 PM EST]
~
# ls -lh 20070604-desktop.log.gz
-rw-r--r-- 1 numard numard 11M Jul 21 12:09 20070604-desktop.log.gz
# time gunzip 20070604-desktop.log.gz
real 0m2.502s
user 0m1.049s
sys 0m1.044s
# time bzip2 20070604-desktop.log
real 3m22.587s
user 3m17.566s
sys 0m1.741s
[root@cerberus] [Sat 21 Jul 2007 12:29:19 PM EST]
~
# ls -lh 20070604-desktop.log.bz2
-rw-r--r-- 1 numard numard 5.2M Jul 21 12:09 20070604-desktop.log.bz2
# time bunzip2 20070604-desktop.log.bz2
real 0m17.544s
user 0m15.261s
sys 0m1.435s
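
To repeat this comparison on another file without replacing the original, a minimal sketch along these lines could be used. It assumes a shell with mktemp available and a compressor that accepts -c to write to standard output (gzip and bzip2 both do). bench is a hypothetical helper name, and date +%s only has whole-second resolution, so this suits runs that take more than a few seconds:

```shell
#!/bin/sh
# bench FILE COMMAND [ARGS...]
# Compresses FILE to a temporary file with COMMAND, leaving FILE untouched,
# and prints "<elapsed seconds> <compressed bytes> <command and args>".
# (bench is a hypothetical helper, not part of gzip or bzip2.)
bench() {
    file=$1; shift
    out=$(mktemp) || return 1
    start=$(date +%s)
    # -c makes the compressor write to stdout instead of replacing FILE.
    "$@" -c "$file" > "$out" || { rm -f "$out"; return 1; }
    end=$(date +%s)
    size=$(wc -c < "$out" | tr -d ' ')
    rm -f "$out"
    echo "$((end - start)) $size $*"
}

# Usage, with the log file name from the runs above:
#   bench 20070604-desktop.log gzip -9
#   bench 20070604-desktop.log bzip2
#   bench 20070604-desktop.log bzip2 --fast
```

Writing to a temporary file rather than /dev/null keeps disk writes in the measurement, which is closer to what the timed runs above did.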
_________________________
{Beto|Norberto|Numard} Meijome
"They redundantly repeated themselves over and over again incessantly without end ad infinitum"
ibid.
I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.