I just learned about pbzip2, which lets your multicore computer use more than one core when using the bzip2 compression algorithm.
On my Mac Pro at work, I installed it with MacPorts (`sudo port install pbzip2`). It is this kind of awesome:

    $ ls -lh original.tar
    -rw-r--r--  1 jmcmurry  staff   2.4G Feb  4 13:47 original.tar
    $ time bzip2 -k -v original.tar
      original.tar: 36.215:1,  0.221 bits/byte, 97.24% saved, 2604288000 in, 71911733 out.
    real    13m3.313s
    user    12m50.536s
    sys     0m3.773s
    $ mv original.tar.bz2 bzip2.tar.bz2
    $ time pbzip2 -k -v original.tar
    Parallel BZIP2 v1.0.5 - by: Jeff Gilchrist [http://compression.ca]
    [Jan. 08, 2009]             (uses libbzip2 by Julian Seward)
             # CPUs: 8
     BWT Block Size: 900k
    File Block Size: 900k
    -------------------------------------------
             File #: 1 of 1
         Input Name: original.tar
        Output Name: original.tar.bz2
         Input Size: 2604288000 bytes
    Compressing data...
    -------------------------------------------
         Wall Clock: 119.369207 seconds
    real    1m59.612s
    user    14m39.090s
    sys     0m44.840s

Sweet. About 6.5x faster in wall-clock time (13m3s down to 2m0s) by adding a “p” to my command line.
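For anyone who wants to check the arithmetic, here is a small Python sketch that turns the two `real` times from the transcript into a speedup and a rough per-core efficiency. The helper name `to_seconds` is my own, and dividing by the 8 CPUs that pbzip2 reported is only a back-of-the-envelope measure.

```python
def to_seconds(stamp):
    """Convert a `time`-style stamp such as '13m3.313s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)

# "real" (wall-clock) times copied from the transcript above.
serial_real = to_seconds("13m3.313s")    # bzip2
parallel_real = to_seconds("1m59.612s")  # pbzip2
cores = 8                                # "# CPUs: 8" in the pbzip2 output

speedup = serial_real / parallel_real
efficiency = speedup / cores             # fraction of an ideal 8x speedup

print(f"speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
# prints "speedup: 6.55x, efficiency: 82%"
```

The `user` time went up (12m50s to 14m39s), so the total CPU work grew a bit; it just got spread over eight cores.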
The resulting compressed .bz2 files aren’t exactly the same according to md5 (the pbzip2 output is a little larger, which makes sense due to the splitting of the work), but when they decompress, they’re both identical to the original .tar file.
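As far as I understand it, pbzip2 compresses chunks of the input independently and writes them out one after another as separate bzip2 streams, which any bzip2 decompressor reads back as a single file. Here is a minimal Python sketch of that idea using the standard `bz2` module. It is an illustration, not pbzip2's actual code: the sample data and the 100,000-byte chunk size are made up (pbzip2 used 900k blocks above), and it runs on one core.

```python
import bz2

# Made-up stand-in for original.tar: roughly 1 MB of repetitive text.
data = b"".join(b"line %d of some repetitive text\n" % i for i in range(30_000))

# One stream over the whole input, the way plain bzip2 writes it.
single = bz2.compress(data, compresslevel=9)

# Independent chunks, each compressed as its own stream, then concatenated.
# The chunk size here is an arbitrary choice for the demo.
chunk_size = 100_000
chunked = b"".join(
    bz2.compress(data[i:i + chunk_size], compresslevel=9)
    for i in range(0, len(data), chunk_size)
)

# The compressed bytes differ, so their checksums would too...
print("compressed files identical:", single == chunked)
print("sizes:", len(single), "vs", len(chunked))

# ...but both decompress to the same original bytes.
# bz2.decompress handles concatenated streams (Python 3.3+).
print("single round-trips:", bz2.decompress(single) == data)
print("chunked round-trips:", bz2.decompress(chunked) == data)
```

Because each chunk is compressed without seeing the others, the chunks could be handed to different cores, and each one carries its own stream header and trailer.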
See also: mgzip.