Friday, January 24, 2014

Compression curiosities

This page will list several compression curiosities. 

Extremely high compression ratio

'What is the best compressor to get really extreme compression?' is a question often asked on the internet. The achieved compression ratio depends, of course, on the quality of the compressor used, but the type of data being compressed matters much more. To show this, I created a tiny 115-byte RAR file. When you decompress it, it turns into a text file of almost 5 MB (a compression ratio of 99.998%). Here it is: test.rar. Note: Turtle 0.07 compresses the same file to 49 bytes, Hook09c (with switches 3 1 1 1) to 36 bytes and UHBC (with switches -b128m -m3) to 24 bytes!
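To make the point about data type concrete, here is a minimal Python sketch using only standard-library compressors. The contents of test.rar are not described above, so the repeated line below is an assumption chosen purely for illustration; `space_saving` is a hypothetical helper, not part of any tool mentioned here.

```python
import bz2
import lzma
import zlib


def space_saving(original_size: int, compressed_size: int) -> float:
    """Fraction of the original size that compression removed."""
    return 1.0 - compressed_size / original_size


# Assumption: one short line repeated until the text is roughly 5 MB.
line = b"This line is repeated over and over again.\n"
text = line * (5_000_000 // len(line))

results = {}
for name, compress in (("zlib", zlib.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)):
    packed = compress(text)
    results[name] = space_saving(len(text), len(packed))
    print(f"{name:5s} {len(packed):8d} bytes  saving {results[name]:.4%}")

# The figures quoted in the text: 115 bytes from roughly 5 MB.
quoted = space_saving(5_000_000, 115)
print(f"quoted saving: {quoted:.4%}")
```

Swapping the repeated line for ordinary prose of the same length should shrink the savings considerably, which is the point: the input matters more than the compressor.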

Compressing a file makes it smaller?

Most of the time yes, but not always. A file containing pure random data is not compressible at all (the resulting archive might even be bigger!). A compressor works by finding repeating patterns inside the file it is compressing. Random data does not have these patterns, so it will not compress. The same goes for trying to compress precompressed data like RAR and 7z files. Certain movie formats like MPEG and AVI are also already highly compressed, which makes those files very difficult to compress even further. A pure random file can be found here: a.bin. Try to compress it with your favourite compressor/archiver and look at the resulting file size...
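The same experiment can be tried without downloading anything. The sketch below assumes that bytes from `os.urandom` are a fair stand-in for a.bin, and compares the input size with the output size of three standard-library compressors.

```python
import bz2
import lzma
import os
import zlib

# Assumption: 1 MB from the operating system's random source stands in for a.bin.
random_data = os.urandom(1_000_000)

sizes = {}
for name, compress in (("zlib", zlib.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)):
    sizes[name] = len(compress(random_data))
    change = sizes[name] - len(random_data)
    print(f"{name:5s} {sizes[name]:8d} bytes  ({change:+d} bytes vs. input)")
```

Each output is typically a little larger than the input, because the container format adds headers and bookkeeping while finding nothing to remove.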

High and low compression with one file?

Does there exist a file which is extremely compressible by one archiver, but almost incompressible by another one? I thought the answer was no, but Nimda Admin sent me a file with these very strange properties. Please download and extract the rarred file strange.rar and try to compress it with (Win)RAR and (Win)ZIP. You will see that compression in RAR is extremely good, but compression in ZIP is almost 0%! Just when you think you understand compression, someone sends you this file.

Update: Several people analysed the file and concluded it's optimally compressed using double delta compression.
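The actual contents of strange.rar are not described here, so the following is only an illustration of why double delta coding can make such a difference, not a reconstruction of that file. It assumes a made-up sequence of 32-bit squares, whose second differences are constant, and uses zlib (the Deflate method used by ZIP) with and without a delta step applied by hand. The helper names `delta`, `undelta` and `pack` are hypothetical.

```python
import struct
import zlib
from itertools import accumulate

MASK = 0xFFFFFFFF  # keep every value inside an unsigned 32-bit word


def delta(values):
    """Replace each value by its difference from the previous one (mod 2**32)."""
    out, prev = [], 0
    for v in values:
        out.append((v - prev) & MASK)
        prev = v
    return out


def undelta(deltas):
    """Invert delta() with a running sum (mod 2**32)."""
    return list(accumulate(deltas, lambda a, b: (a + b) & MASK))


def pack(values):
    """Serialise the values as little-endian unsigned 32-bit integers."""
    return struct.pack(f"<{len(values)}I", *values)


# Assumption: an artificial signal whose second differences are constant.
samples = [i * i for i in range(60_000)]

twice = delta(delta(samples))        # double delta: almost every entry is 2
restored = undelta(undelta(twice))   # the transform is lossless

raw_size = len(zlib.compress(pack(samples)))
dd_size = len(zlib.compress(pack(twice)))
print(f"deflate on raw data:     {raw_size} bytes")
print(f"deflate on double delta: {dd_size} bytes")
```

A compressor that applies this kind of filter itself sees the nearly constant stream, while one that only looks for repeated byte strings sees the raw values and finds few matches.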

A compressed file decompressing to itself

There have been some discussions in the past about whether it's possible to have a file in gzip format, or another compression format, that decompresses to itself. Someone called Caspian Maclean created such a file a couple of years ago. Here it is: selfgz.gz. A real masterpiece.
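Checking the claim takes only a few lines. This sketch does not construct such a file; it only tests whether a given byte string is a fixed point of gzip decompression. The function name `is_gzip_fixed_point` is hypothetical, and the commented-out lines assume selfgz.gz has been saved in the current directory.

```python
import gzip
import zlib


def is_gzip_fixed_point(data: bytes) -> bool:
    """Return True if gzip-decompressing `data` yields `data` again."""
    try:
        return gzip.decompress(data) == data
    except (OSError, EOFError, zlib.error):
        # Not a valid gzip stream, so it cannot decompress to itself.
        return False


# An ordinary gzip file is not a fixed point.
ordinary = gzip.compress(b"hello, world\n")
print(is_gzip_fixed_point(ordinary))

# Assumption: selfgz.gz has been downloaded next to this script.
# with open("selfgz.gz", "rb") as fh:
#     print(is_gzip_fixed_point(fh.read()))
```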

