File Compression Explained — Why Files Get Smaller (And Sometimes Can't)
How file compression actually works. Lossless vs lossy, ZIP vs RAR vs 7z, why some files compress 90% and others barely budge.
The Magic Trick That Isn't Magic
File compression feels like magic: a 100 MB file becomes 10 MB, and when you decompress it, everything's still there. How?
The short answer: patterns. Every file has patterns — repeated sequences, predictable structures, redundant information. Compression algorithms find these patterns and replace them with shorter representations.
Think of it like this: instead of writing "the the the the the," you write "5×the." Same information, fewer characters. Real compression is vastly more sophisticated, but the principle is the same.
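The "5×the" idea is essentially run-length encoding. Here is a minimal Python sketch of it, assuming whitespace-separated words. The function names `rle_encode` and `rle_decode` are illustrative only, and real archive formats use far more elaborate schemes:

```python
from itertools import groupby


def rle_encode(words):
    """Collapse runs of identical consecutive words into (count, word) pairs."""
    return [(len(list(run)), word) for word, run in groupby(words)]


def rle_decode(pairs):
    """Expand (count, word) pairs back into the original word list."""
    return [word for count, word in pairs for _ in range(count)]


text = "the the the the the"
encoded = rle_encode(text.split())
print(encoded)  # [(5, 'the')]

# Decoding gives back exactly what we started with.
restored = " ".join(rle_decode(encoded))
print(restored == text)  # True
```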
Lossless vs Lossy: The Fundamental Split
Lossless Compression
Every single bit is preserved. Decompress the file, and you get the exact original. Used for:
- ZIP, RAR, 7z, GZIP archives
- PNG images
- FLAC audio
- Text, code, databases — anything where every bit matters
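To see the "exact original" guarantee in practice, here is a small sketch using Python's standard `zlib` module (DEFLATE, the method behind GZIP and most ZIP files). The sample bytes are made up for illustration:

```python
import zlib

# Made-up sample data; any bytes would do.
original = b"every bit matters " * 1000

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "->", len(compressed), "bytes")
print(restored == original)  # the round trip is bit-for-bit identical
```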
Lossy Compression
Some data is permanently discarded to achieve smaller files. The discarded data is chosen to minimize perceptible quality loss. Used for:
- JPG images (discards visual details humans can't easily see)
- MP3/AAC audio (discards sounds masked by louder sounds)
- H.264/H.265 video (stores mostly the changes between frames, then discards fine detail viewers rarely notice)
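Real codecs are far more sophisticated, but the core trade can be sketched with simple quantization: throw away low-order detail, and the remaining data becomes easier to compress. The example below is only an illustration. The "signal" is a hypothetical sine wave with noise, and clearing the low 4 bits stands in for the perceptual models that JPG, MP3, and H.264 actually use:

```python
import math
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical 8-bit "signal": a slow sine wave plus a little noise.
samples = bytes(
    int(128 + 100 * math.sin(i / 20)) + rng.randint(-8, 8)
    for i in range(10_000)
)

# Lossy step: clear the low 4 bits, keeping only 16 distinct levels.
quantized = bytes(s & 0xF0 for s in samples)

lossless_size = len(zlib.compress(samples))
lossy_size = len(zlib.compress(quantized))

print("lossless:", lossless_size, "bytes")
print("lossy:   ", lossy_size, "bytes")
print("original recoverable:", quantized == samples)  # False: detail is gone
```

The quantized version compresses much better, but the discarded low bits cannot be brought back.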
Why Some Files Compress Well and Others Don't
| File Type | Typical ZIP Compression | Why |
|---|---|---|
| Plain text (.txt, .csv) | 60-90% smaller | Highly repetitive patterns |
| Word documents (.docx) | 0-5% smaller | Already ZIP-compressed internally |
| Source code | 70-85% smaller | Repetitive keywords and structure |
| BMP images | 80-95% smaller | Tons of redundancy |
| PNG images | 0-2% smaller | Already compressed |
| JPG images | 0-3% smaller | Already compressed |
| MP3 audio | 0-2% smaller | Already compressed |
| MP4 video | 0-1% smaller | Already compressed |
| Database files | 50-80% smaller | Structured, repetitive data |
| Executable (.exe) | 20-40% smaller | Some patterns, less than text |
This is why zipping a folder of photos barely reduces its size — the photos are already compressed.
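The gap between the two extremes is easy to reproduce. In this sketch, repetitive CSV-style text stands in for plain text files, and `os.urandom` stands in for already-compressed data such as JPG or MP3 (an assumption for illustration: well-compressed data looks close to random). The helper name `percent_smaller` is hypothetical:

```python
import os
import zlib


def percent_smaller(data: bytes) -> float:
    """Return how much smaller the data gets under zlib, as a percentage."""
    compressed = zlib.compress(data)
    return 100 * (1 - len(compressed) / len(data))


repetitive = b"name,city,status\nalice,paris,active\n" * 2_000
random_like = os.urandom(len(repetitive))  # stand-in for already-compressed data

print(f"repetitive text: {percent_smaller(repetitive):.1f}% smaller")
print(f"random bytes:    {percent_smaller(random_like):.1f}% smaller")
```

The second figure can come out slightly negative, because the compressed container adds a few bytes of overhead.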
ZIP vs RAR vs 7z vs GZIP
| Format | Compression Ratio | Speed | Compatibility | Encryption |
|---|---|---|---|---|
| ZIP | Good | Fast | Universal | AES-256 |
| RAR | Very Good | Medium | Needs WinRAR/7-Zip | AES-256 |
| 7z (LZMA2) | Excellent | Slow | Needs 7-Zip | AES-256 |
| GZIP | Good | Fast | Linux/web standard | No |
| Brotli | Very Good | Slow | Web (HTTP) | No |
| Zstandard | Excellent | Fast | Growing (developed at Facebook) | No |
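You can get a rough feel for the ratio-versus-speed trade-off with Python's standard library, which includes `zlib` (DEFLATE, as used by GZIP and most ZIP files) and `lzma` (the LZMA family used by 7z, here written in the .xz container). This is only a sketch: the log-style sample data and the `measure` helper are made up for illustration, the other formats in the table are not covered, and the numbers will vary with your data and machine:

```python
import lzma
import time
import zlib

# Made-up sample: semi-repetitive log lines.
data = b"".join(
    b"2024-01-01 12:00:%02d INFO request id=%d status=200\n" % (i % 60, i)
    for i in range(5_000)
)


def measure(name, compress, decompress):
    """Compress the sample once; return (name, compressed size, seconds)."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    assert decompress(packed) == data  # both codecs are lossless
    return name, len(packed), elapsed


results = [
    measure("zlib (DEFLATE)", zlib.compress, zlib.decompress),
    measure("lzma (xz)", lzma.compress, lzma.decompress),
]

print(f"original        {len(data):>8} bytes")
for name, size, elapsed in results:
    print(f"{name:15} {size:>8} bytes  {elapsed * 1000:.1f} ms")
```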
Compression Levels
Most tools let you choose a compression level (1-9 or similar). Higher levels take longer and often save only a little extra space:
| Level | Speed | Size Reduction | Best For |
|---|---|---|---|
| 1 (fastest) | Very fast | Minimal | Quick archiving, temporary files |
| 5-6 (default) | Moderate | Good | General use |
| 9 (maximum) | Very slow | Slightly better than 6 | Long-term archival |
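To check how much the level matters for your own data, a quick comparison like this one can help. It uses `zlib`, whose levels run from 1 to 9. The sample text is made up, and other tools may number their levels differently:

```python
import zlib

# Made-up sample: numbered lines of ordinary text.
data = b"".join(
    b"row %d: the quick brown fox jumps over the lazy dog\n" % i
    for i in range(20_000)
)

# Compress the same bytes at a fast, a default-like, and the maximum level.
sizes = {level: len(zlib.compress(data, level)) for level in (1, 6, 9)}

print(f"original: {len(data)} bytes")
for level, size in sizes.items():
    print(f"level {level}: {size} bytes")
```

If the gap between level 6 and level 9 is small on a representative sample, the extra time is probably not worth it.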
Real-World Compression Scenarios
- Sending a folder of documents via email: ZIP the folder → often 50-70% smaller. MyPDF's Create ZIP handles this quickly online.
- Archiving project files: 7z with LZMA2 compression → typically the smallest archive of the formats compared here. Use 7-Zip (free desktop software).
- Reducing a PDF for email: Don't ZIP it. Use PDF compression instead, which re-compresses the images inside the PDF and is much more effective than generic ZIP compression on an already-compressed PDF.
- Backing up a code repository: GZIP (as a .tar.gz) or ZIP → excellent compression on source code. Git already stores objects compressed, so compressing a .git folder yields minimal additional savings.
The Diminishing Returns of Compression
A common misconception: "If I compress it twice, it'll be even smaller." Wrong. Compressing a ZIP file produces a ZIP that's the same size (or slightly larger, due to archive overhead). The patterns have already been found and replaced — there's nothing left to optimize.
This applies to all compression: re-encoding an MP3 at the same bitrate doesn't make it smaller, re-saving a JPG makes it look worse rather than smaller, and a ZIP of a ZIP just adds overhead.
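A quick experiment shows the effect. This sketch compresses some made-up text once, then compresses the result again with `zlib`. The vocabulary and seed are arbitrary choices for illustration:

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
vocabulary = "file data zip pattern byte archive image audio video text code level".split()

# Made-up text: 50,000 words drawn at random from a small vocabulary.
data = " ".join(rng.choice(vocabulary) for _ in range(50_000)).encode()

once = zlib.compress(data)
twice = zlib.compress(once)

print("original:        ", len(data), "bytes")
print("compressed once: ", len(once), "bytes")
print("compressed twice:", len(twice), "bytes")
```

The first pass removes most of the redundancy, so the second pass has almost nothing left to work with. One caveat: extremely repetitive input, such as a file of all zeros, can leave regular patterns in the compressed output, and in that unusual case a second pass may still shrink it.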
Frequently Asked Questions
Does compression speed up file transfers?
Yes, if the compression ratio is significant. A 100 MB text file compressed to 10 MB transfers 10x faster (minus compression/decompression time). A 100 MB JPG compressed to 99 MB? Not worth it.
Can corrupted compressed files be recovered?
Partially. ZIP files can sometimes recover uncorrupted portions. RAR files with "recovery records" (a RAR-specific feature) can reconstruct damaged sections. 7z files are harder to recover from corruption.
What's the most compressed a file can get?
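It depends entirely on the content. A file of all zeros compresses to nearly nothing. Truly random data can't be compressed: a simple counting argument shows that no lossless algorithm can shrink every possible input, because there are fewer short files than long ones. Real files fall somewhere between.
Related Tools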
- Compress PDF — Intelligent PDF-specific compression
- Create ZIP — Create ZIP archives online
- Extract ZIP — Extract ZIP files online
- Compress Image — Lossy image compression