The basics of file compression:
Almost everybody in this day and age has heard of file compression. Anyone who has downloaded files from the web is familiar with the ZIP and RAR formats. Anyone who edits media knows that compression is necessary to share files, images, music and videos over the web without using up all your bandwidth.
It is not magic, but the result of hard work by many ingenious people.
There are 2 types of compression – lossless and lossy.
Just a warning – I’m going to oversimplify things here in an attempt to make this readable by non-math majors. Check out the linked-to Wikipedia articles for more depth, and Wikipedia’s sources for even more.
Lossless compression basically works by removing redundancy. What does that mean? Let’s simplify things. This stack of bricks will represent our data:
[Image: a stack of ten bricks: two red, five yellow and three blue]
As you can see we’ve got two red bricks, five yellow and three blue. The simplest way to represent this is as you see above: the bricks themselves. But it’s not the only way I can represent this. I could also do this:
[Image: three bricks, one red, one yellow and one blue, each labelled with the number of bricks it stands for]
In the above image you can see the exact same information – two red, five yellow and three blue – but it takes up significantly less space. I’ve represented redundant bricks using numbers, meaning I need only three bricks to represent ten.
This gives you a rough idea of how lossless compression is possible. Information that’s redundant is replaced with instructions telling the computer how many times identical data repeats. Another simplified example:
Can be “compressed” to:
This is only one method of lossless compression, of course, but it points to how this is possible. Other math tricks are used, but the main thing to remember about lossless compression is that while space is temporarily saved, it is possible to reconstruct the original file entirely from the compressed one. If you see three bricks with numbers you know exactly how to make the stack. No information is lost, just as the name lossless implies.
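The brick example can be sketched in a few lines of Python. This is a minimal illustration of the counting trick described above (usually called run-length encoding), not how ZIP or any real archiver works; the function names `rle_encode` and `rle_decode` are made up for this sketch.

```python
from itertools import groupby


def rle_encode(items):
    """Collapse each run of identical items into a (count, item) pair."""
    return [(len(list(run)), item) for item, run in groupby(items)]


def rle_decode(pairs):
    """Expand (count, item) pairs back into the original sequence."""
    out = []
    for count, item in pairs:
        out.extend([item] * count)
    return out


# The stack of bricks from the example: two red, five yellow, three blue.
bricks = ["red"] * 2 + ["yellow"] * 5 + ["blue"] * 3

packed = rle_encode(bricks)
print(packed)  # [(2, 'red'), (5, 'yellow'), (3, 'blue')]

# Lossless: decoding gives back exactly the stack we started with.
assert rle_decode(packed) == bricks
```

Three pairs stand in for ten bricks, and decoding rebuilds the stack exactly. Note that this trick only pays off when the data contains long runs; data with no repeats would come out larger, since every item would carry a count.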
Programs like WinZip are based on lossless compression. They remove this redundant information when you compress (or “zip”) the file and restore it when you uncompress (or “unzip”). Nothing is lost.
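The zip-then-unzip round trip can be tried with Python's standard `zlib` module. This is only a stand-in for what an archiver does (it is not WinZip's code, and the sample data is invented for the demonstration), but it shows the key property: the restored bytes are identical to the original.

```python
import zlib

# Made-up, highly repetitive sample data.
original = b"yellow brick " * 200

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")

# Nothing is lost: the round trip reproduces the input byte for byte.
assert restored == original
```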
In the image world, PNG files also use lossless compression. This is why they offer a smaller file size for images with lots of uniform space: that redundant information is represented using instructions.
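The effect of uniform space is easy to see with a rough experiment. The sketch below does not read or write PNG files; it simply runs `zlib` (the same DEFLATE family of compression that PNG uses) on two invented buffers of equal size, one completely uniform and one random.

```python
import os
import zlib

SIZE = 10_000

uniform = bytes(SIZE)     # 10,000 identical bytes, like a flat background
noisy = os.urandom(SIZE)  # 10,000 random bytes, with no redundancy to remove

uniform_packed = len(zlib.compress(uniform))
noisy_packed = len(zlib.compress(noisy))

print("uniform:", SIZE, "->", uniform_packed, "bytes")
print("noisy:  ", SIZE, "->", noisy_packed, "bytes")
```

The uniform buffer shrinks to a tiny fraction of its size, while the random one barely changes, because there is no repetition for the compressor to describe.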
Of course, this is all an oversimplification, but it gets the basic point across. Read more about lossless compression on Wikipedia, if you’re interested.
Of course, there’s only so much you can accomplish using only lossless methods. Happily they’re not the only option: you can also simply remove information. This is called lossy compression, and it’s not as crazy as it sounds; in fact, you probably have many files on your computer made using lossy compression.
An MP3, for example. If you’re like most people your computer stores thousands of them for you, but did you know they don’t contain all of the audio information the original recording did? Some sounds, which humans cannot or can barely hear, are removed as part of the compression. The more you compress a file the more information is removed, which is why an overly compressed file will start to sound muddy.
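To get a feel for throwing information away, here is a toy sketch that rounds a list of numbers to a coarser and coarser grid. It is not how MP3 works (real audio codecs model human hearing); the sample values and the `quantize` and `worst_error` helpers are invented purely to show that heavier compression means a bigger gap between the original and what you get back.

```python
def quantize(samples, step):
    """Round every sample to the nearest multiple of `step` (lossy)."""
    return [round(s / step) * step for s in samples]


def worst_error(samples, step):
    """Largest difference between a sample and its quantized version."""
    return max(abs(a - b) for a, b in zip(samples, quantize(samples, step)))


# Made-up "audio samples" for illustration.
samples = [3, 14, 15, 92, 65, 35, 89, 79]

for step in (1, 10, 50):
    approx = quantize(samples, step)
    print("step", step, "->", approx, "worst error:", worst_error(samples, step))
```

With a step of 1 nothing changes. With larger steps, different original values collapse onto the same rounded value, and once that has happened there is no way to tell from the result which original you started with.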
Lossy compression is mostly used for media files – pictures, sound and video. Using lossy compression for a text file would be problematic, as the resulting information would be garbled. It’s not always necessary for media files to include all the information, however.
Another example of lossy compression is the JPEG image. Generally speaking, images seen on the web do not need to be as high-quality as images intended for printing. As such, you can remove a lot of fine detail from a web image, even if the result would look awful printed.
Of course, repeatedly compressing a file using lossy methods decreases the quality – every time you do it more data is lost. Below is a photo I’ve compressed three times to demonstrate this:
[Image: a photo shown at successive stages of lossy compression, with quality decreasing from left to right]
You can see from left to right how the quality decreases. It may not matter, depending on what the image will be used for, and that’s why lossy compression exists.
It’s important to remember that files compressed using lossy methods actually lose data, meaning you cannot recreate the original file from one compressed using lossy methods. It’s obvious when you think about it, but many printing projects have been ruined for lack of understanding this key point.
I’ve really only scratched the surface here, so please: read more about lossy compression on Wikipedia. It’s kind of fascinating.
Compression helped make the web what it is. In the days of dialup, photos could not have reached our browsers without compressed images, at least not at an acceptable speed. Compressed video makes sites like YouTube possible, and anyone who uses file sharing networks is familiar with ZIP and RAR files.