More Dabbling with File Compression Types

Posted on December 15, 2007

Continuing my research into a fast and effective compression option for large files (see part 1), I’ve stumbled over some surprising numbers. My hope is this stuff will help me at a later date when I revisit this topic, and hopefully it’ll help someone else too. Since I’m a sharer, here is where I am at so far.
(Note: tables and numbers ahead — you’ve been warned!)

The first file I compress tends to be around 380 - 400 MB. Not huge, but big enough to warrant some compression before copying it over our VPN between the office and the data center.

There are more variables involved than seem readily apparent. 7-zip not only supports multiple compression types, but for each type there can be multiple compression levels. And that seems to be where I’ve gone awry in the past…
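To make those two knobs concrete, here is a minimal sketch of how the type and level could be passed to the 7-Zip command line from Python. It assumes the `7z` executable is on the PATH and uses its `-t` (archive type) and `-mx` (compression level) switches; the file names and function names are hypothetical, and this is not necessarily the command behind the timings in this post.

```python
import subprocess

# Archive types compared in this post, as accepted by 7-Zip's -t switch.
ARCHIVE_TYPES = {"zip", "bzip2", "7z"}


def build_7z_command(source, archive, archive_type="7z", level=5):
    """Return the argument list for a `7z a` (add to archive) call."""
    if archive_type not in ARCHIVE_TYPES:
        raise ValueError(f"unsupported archive type: {archive_type}")
    if not 0 <= level <= 9:
        raise ValueError("level must be between 0 and 9")
    return ["7z", "a", f"-t{archive_type}", f"-mx={level}", archive, source]


def compress(source, archive, archive_type="7z", level=5):
    """Run 7-Zip; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_7z_command(source, archive, archive_type, level), check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(build_7z_command("export.dmp", "export.7z")))
    print(" ".join(build_7z_command("export.dmp", "export.zip", "zip", 9)))
```

Keeping the command construction separate from the call that runs it makes it easy to print and check the exact switches before kicking off a long compression job.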

Without further ado, let’s look at some of the timings, shall we?

Compression Method   Compression Flag   Compressed Size   Time to Compress
ZIP                  9 (max)            52 M              12 - 13 minutes
BZIP2*               9 (max)            45 M              3 minutes
7Z                   default (5)        31.9 M            1:40 (MI:SS)
ZIP                  default (5)        56 M              1:05 (MI:SS)

*bzip2 w/ crypto appears to not work with my key.

You see those numbers fall? For the last year or so I’ve been running it as the first row in that table — max compression and ZIP compression type. Right away, I saw that I could speed things up just by turning off max compression. It doesn’t seem to be buying me a whole lot. No bang for that buck.

Second, I changed to the 7z compression type. That’s the 7Z row at default compression: a nice improvement over the previous two rows, eh? The ZIP type at default compression is even a bit faster, but the resulting file size isn’t as good.

I want to point out that just changing from max compression to default compression with the ZIP type nets an enormous time savings at just a 4 MB cost. I found that very interesting for some geeky reason.

But maybe I should use max compression with the 7z type? I tried that next:

Compression Method   Compression Flag   Compressed Size   Time to Compress
7Z                   9 (max)            29.3 M            2:40 (MI:SS)

An extra minute to save a bit over 2 MB? Doesn’t hardly seem worth it (or does it?).
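One rough way to answer that question is to ask how slow the link would have to be before the extra minute pays for itself. The sketch below is only back-of-the-envelope arithmetic on the 7z figures above. It assumes compression and transfer happen one after the other and ignores decompression time; the function name is made up for illustration.

```python
def break_even_mb_per_s(extra_seconds, mb_saved):
    """Link speed below which the extra compression time pays for itself.

    The smaller archive saves mb_saved / speed seconds of transfer, so it
    wins whenever speed < mb_saved / extra_seconds.
    """
    return mb_saved / extra_seconds


# 7z default: 31.9 MB in 1:40. 7z max: 29.3 MB in 2:40.
mb_saved = 31.9 - 29.3
extra_seconds = (2 * 60 + 40) - (1 * 60 + 40)

speed = break_even_mb_per_s(extra_seconds, mb_saved)
print(f"break-even link speed: {speed:.3f} MB/s")
```

With these numbers the break-even works out to roughly 0.043 MB/s, so on any link faster than that the default level gets the file delivered sooner overall.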

Next, I wondered what the payoff would be on the other file I’m working with. It tends to be around 5.7 GB. That’s taking this compression stuff to a new level! Much more of a pain in the butt to test with too, I might add. But I can infer some timings from looking at past runs, and I tried 7z (bottom two rows) just for grins:

Compression Method   Compression Flag   Compressed Size   Time to Compress
ZIP                  9 (max)            732.5 M           3:15 (HH:MI)
BZIP2*               9 (max)            585 M             32 minutes
7Z                   default (5)        443.6 M           25 minutes
7Z                   9 (max)            394 M             43 minutes

*bzip2 w/ crypto appears to not work with my key.

Again, the first row is what I’d been running until this week. I can’t help but ask myself why the hell I didn’t look at this sooner!

Look at that final row: that’s where I want to be for the big file. Compared with 7z default, it’s 18 more minutes for an archive about 50 MB smaller. It’s also two and a half hours faster than the ZIP job I was running before! Excellent.
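For anyone who wants to check that arithmetic, here is a small sketch that recomputes the differences from the large-file table. The row data is copied from the table, with one assumption: the ZIP size of 732.5 is taken to be in MB like the other rows. The `compare` helper is just an illustrative name.

```python
# Rows from the large-file table: (method, level, size in MB, minutes).
# Assumption: the ZIP size of 732.5 is in MB, like the other rows.
runs = [
    ("ZIP", "max", 732.5, 3 * 60 + 15),
    ("BZIP2", "max", 585.0, 32),
    ("7Z", "default", 443.6, 25),
    ("7Z", "max", 394.0, 43),
]


def compare(a, b):
    """Return (MB saved, extra minutes) when switching from run a to run b."""
    return round(a[2] - b[2], 1), b[3] - a[3]


old_zip, seven_default, seven_max = runs[0], runs[2], runs[3]

saved, extra = compare(seven_default, seven_max)
print(f"7z max vs 7z default: {saved} MB smaller, {extra} more minutes")

saved, extra = compare(old_zip, seven_max)
print(f"7z max vs ZIP max: {saved} MB smaller, {-extra} minutes faster")
```

The second comparison comes out to 152 minutes, which is the "two and a half hours" mentioned above.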

So… At first bzip2 looked like my answer: fast, with smaller archives. But upon further review, those archives seemed to be corrupted when I tried to use my long passkey. Now I’m on board with the 7z compression type. It definitely appears to be the winner, giving me more speed and the smallest archives.
