
Dec 14

7-Zip: Compression vs. Speed

Earlier this year I discovered the joys of running 7-Zip with multithreaded support. At the time, I saw some incredible time savings. Since then, however, the files being compressed have gotten much larger and I’m back to some inordinately long compression runs.

As a quick refresher, 7-Zip is an open source file archiver that I’ve found to be very flexible. You might compare it to WinZip, but I think 7-Zip typically brings more options to the table. I like the price too. :-)

This past weekend I needed to compress a large (over 5 GB) database backup for a business partner. I ran 7-Zip with my normal parameters and after 30 minutes realized I was looking at over 3 hours for completion. Yuck!

Here’s the command:

7z a -mx=9 -mmt=on -ppasskey backup.zip backup.bak

  • -mx=9 is for highest compression (I have to ship these files around on a VPN. That VPN is on a data plan that includes overages, so keeping the files as small as possible is critical.)
  • -mmt=on turns on multithreading
  • -ppasskey encrypts the archive with passkey as the password

This command appears to use two threads. On my quad-core server, it never goes above 50% utilization.

Over 3 hours was just too long to wait, so I killed the backup and went digging for more options.

A quick look in the help file turned up the fact that the bzip2 compression option would offer 4 threads, so I tried it.

7z a -tbzip2 -mx=9 -mmt=on -ppasskey backup.zip backup.bak

Incredible! Not only did it peg all 4 cores on the server, but it finished in about 30 minutes. Oh, and the resulting archive was almost 200 MB smaller. Super!

I immediately updated all my scheduled compression jobs to use this new-found feature.

The next morning I was checking the compressed archives and made a horrible discovery: sure, they were being built faster, but they weren’t encrypted. I’ve dug around the forums and the documentation and can’t confirm the cause. My best guess is a length limit on the passkey when using bzip2 (my key is around 60 characters long).
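Another possible explanation, offered as an assumption rather than something confirmed here, is that -tbzip2 writes a bare bzip2 stream instead of a zip file, and that format has nowhere to store a password. As Piotrek points out in the comments, bzip2 can instead be used as the compression method inside a zip container. The sketch below assumes a 7-Zip build that accepts -mm=BZip2 for zip archives and that `7z l -slt` prints an `Encrypted = +` line for protected entries. The names bzip2_zip_cmd and is_encrypted are hypothetical helpers, and the first one only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Sketch: keep the zip container (which supports passwords) and select
# bzip2 only as the compression method inside it.
bzip2_zip_cmd() {
    # $1 = archive name, $2 = file to add
    printf '7z a -tzip -mm=BZip2 -mx=9 -mmt=on -ppasskey %s %s\n' "$1" "$2"
}

# Hypothetical check: reads `7z l -slt` output on stdin and succeeds only
# if at least one line contains the literal text "Encrypted = +".
is_encrypted() {
    grep -qF 'Encrypted = +'
}

# Print the command for inspection; nothing is compressed here.
bzip2_zip_cmd backup.zip backup.bak
```

After a real run, something like `7z l -slt -ppasskey backup.zip | is_encrypted && echo encrypted` would be a cheap check to add to a scheduled job, so an unencrypted archive is caught the same night rather than the next morning.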

Also, I noticed that some of the smaller files now produce larger archives with bzip2 than they did before. That’s not cool either.
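Since the better method seems to vary by file, one low-tech option is to build both archives and keep whichever is smaller. A minimal sketch, assuming both candidate archives already exist on disk; smaller_of and the file names in the usage comment are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the path of the smaller of two existing files.
# The first argument wins a tie.
smaller_of() {
    # Arithmetic expansion strips the padding some wc versions emit.
    a_size=$(( $(wc -c < "$1") ))
    b_size=$(( $(wc -c < "$2") ))
    if [ "$a_size" -le "$b_size" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}

# Usage, after building both candidates:
#   keep=$(smaller_of backup-default.zip backup-bzip2.zip)
```

This doubles the compression time, so it probably only makes sense for the smaller files, where the runs are short anyway.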

With 7-Zip, it appears that bzip2 is the only compression type that will go more than 1 or 2 threads, so I’m a bit stumped as to what to try next. Over 3 hours a night is a bit extreme now that I’ve had a taste of 30 minutes!
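One avenue not tried above is LZMA2, which djamu suggests in the comments as supporting more threads than LZMA. A rough sketch, assuming a 7-Zip release that includes LZMA2 and a recipient who can open .7z files. The helper name lzma2_cmd is hypothetical, and it only prints the command so it can be reviewed before running:

```shell
#!/bin/sh
# Sketch: 7z container with LZMA2 as the method and an explicit thread count.
lzma2_cmd() {
    # $1 = thread count, $2 = archive name, $3 = file to add
    printf '7z a -t7z -m0=lzma2 -mx=9 -mmt=%s -ppasskey %s %s\n' "$1" "$2" "$3"
}

# Print the command for a quad-core box; nothing is compressed here.
lzma2_cmd 4 backup.7z backup.bak
```

Whether this beats the 30-minute bzip2 run on a 5 GB backup is untested here. Timing both on the same file would settle it.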


11 comments

  1. Ken

    Have you looked at Winrar? It can open pretty much all of the compressed files.

  2. Chris Kasten

    Good question, Ken. It’s been years since I last used Winrar. Is it still shareware or did they go open source?

  3. mingus

    Winrar is a wrapper for rar (and other algorithms), AFAIK still shareware. 7z claims it will compress ~10% better (depending on data), which I’ve confirmed in my own tests. There is an open source version of rar that runs on Linux; I can’t say if there is one for W$.

  4. Chris Kasten

    Mingus – thanks for the info (and for stopping by). Much appreciated :)

  5. tokinger

    If you really care about efficiency, try FreeArc. With a single core it is as fast as 7zip and gives an even better compression ratio.

    http://freearc.sourceforge.net/

  6. markvt

    elena.tar: 53.0 MB (55,613,440 bytes)
    Packed with bzip2 (ultra) : 10.8 MB (11,413,925 bytes)
    Packed with 7zip (ultra) : 7.32 MB (7,676,983 bytes)

    Bzip2 is faster, but its compression ratio is not as good as 7zip’s. I’ll stick with the 7zip format (it takes longer, but it saves me disk space ;)

  7. Piotrek

    Bzip2 compression can be used in zip or 7z archives, so one can use AES encryption with bzip2.

  8. trancepy

    Just fyi – I tried this on a 32-core server, and it maxed out all 32 cores compressing a file – incredibly fast. Thanks for this!

  9. Vikram Sridharan

    Thanks for the info. I’ve written an article on how to compress really large files on a server using the things you mention, and I’ve linked your article as well.

    http://piglings.blogspot.com/2010/09/compress-large-files-on-windows-server.html

    Cheers !

  10. djamu

    you should use LZMA2 ( aka XZ ) ..
    this newer library supports 4/8/16/… threads while LZMA can only do 2 …

    for GUI users > it is available in the compressortype dropdown menu, and so is the fastest / best compressor around…

  11. miro

    thank you djamu, I was wondering why I could only select 2 / 8 threads. I had 10gb to backup on 4.7gb dvds so all I really wanted to do was split it up, and it only took 10 minutes!
