How to Create TAR and TAR.GZ Archives — The Linux Standard
Understand TAR and TAR.GZ archives, when to use them instead of ZIP, and how to create them for Linux deployments, source code distribution, and backups.
If you work with Linux servers, open-source software, or any kind of deployment pipeline, you've seen .tar.gz files. They're everywhere. And yet, most people who use them daily don't really understand what they are or why they exist as two separate layers.
Let's fix that.
TAR Is Not Compression
This is the most common misunderstanding. TAR stands for Tape Archive. It was created in 1979 for writing files sequentially to magnetic tape drives. All it does is concatenate multiple files and directories into a single stream, preserving file permissions, ownership, timestamps, and directory structure.
A .tar file is not compressed. It's roughly the same size as the original files combined — sometimes slightly larger because of header metadata.
Compression is a separate step. You pipe the TAR stream through a compression algorithm:
- TAR + gzip = .tar.gz or .tgz: The classic. Fast compression, universally supported.
- TAR + bzip2 = .tar.bz2: Better compression ratio, slower. Less common now.
- TAR + xz = .tar.xz: Best compression ratio, slowest. Used by many Linux distros for package distribution.
- TAR + zstd = .tar.zst: Modern. Fast like gzip, compresses like bzip2. Growing adoption.
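The two layers are easy to see on the command line. The sketch below is only an illustration: the site directory and its contents are made up, and the sizes you get will depend on your tar implementation. It bundles a directory into a plain .tar, compresses that bundle with gzip as a separate step, and then does both at once with the z flag.

```shell
set -eu
work=$(mktemp -d)            # scratch directory for the demo; delete it when done
mkdir "$work/site"

# Highly compressible sample data: the same log line repeated 2000 times.
i=0
while [ "$i" -lt 2000 ]; do
  echo "GET /index.html 200 OK"
  i=$((i + 1))
done > "$work/site/access.log"
echo "hello" > "$work/site/index.html"

# Layer 1: bundle only. No compression happens here.
tar -cf "$work/site.tar" -C "$work" site

# Layer 2: compress the bundle as a separate step.
gzip -c "$work/site.tar" > "$work/site.tar.gz"

# Both layers in one command: the z flag runs the stream through gzip.
tar -czf "$work/onestep.tar.gz" -C "$work" site

raw_size=$(($(wc -c < "$work/site/access.log")))
tar_size=$(($(wc -c < "$work/site.tar")))
gz_size=$(($(wc -c < "$work/site.tar.gz")))
echo "raw log: $raw_size bytes, tar: $tar_size bytes, tar.gz: $gz_size bytes"
```

The plain .tar should come out at least as large as the raw log, because of the headers and padding, while the .tar.gz should be far smaller.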
TAR vs ZIP — When Each Makes Sense
This comes up constantly, so here's the honest comparison:
| Factor | TAR.GZ | ZIP |
|---|---|---|
| Unix permissions preserved | Yes | Partially (depends on tool) |
| Symlinks preserved | Yes | Rarely |
| Native on Linux/Mac | Yes | Mostly (zip/unzip packages) |
| Native on Windows | Partially (command-line tar in Windows 10 and later; older versions need 7-Zip or similar) | Yes (built-in since XP) |
| Compression ratio | Better (compresses as one stream) | Slightly worse (each file compressed individually) |
| Random file access | No (must decompress sequentially) | Yes (can extract single files) |
| Streaming friendly | Yes | No (needs central directory at end) |
For source code repositories — where files are full of repeated keywords and patterns — TAR.GZ typically achieves 10-20% better compression than ZIP on the same content.
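You can get a rough feel for the one-stream advantage without any ZIP tool. The sketch below is an approximation, not a real ZIP benchmark: it generates 50 near-identical, made-up source files, then compares gzip over a single TAR stream against the total you get from gzipping each file on its own, which mimics ZIP's per-entry compression.

```shell
set -eu
work=$(mktemp -d)
mkdir "$work/src"

# 50 small, near-identical "source files" (hypothetical content).
n=1
while [ "$n" -le 50 ]; do
  {
    echo "// module $n"
    j=1
    while [ "$j" -le 40 ]; do
      echo "export function handler$j(request) { return respond(request, $j); }"
      j=$((j + 1))
    done
  } > "$work/src/module$n.js"
  n=$((n + 1))
done

# One stream: gzip sees every file back to back and can reuse patterns across files.
solid=$(($(tar -cf - -C "$work" src | gzip -c | wc -c)))

# Per file: each file is compressed in isolation, roughly what ZIP does per entry.
per_file=0
for f in "$work"/src/*.js; do
  size=$(($(gzip -c "$f" | wc -c)))
  per_file=$((per_file + size))
done

echo "single stream: $solid bytes, per-file total: $per_file bytes"
```

This sample is deliberately repetitive, so the gap here is much wider than you would see on a real repository.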
When You Actually Need TAR
Deploying to Linux servers. If you're shipping code to a VPS, Docker container, or CI/CD pipeline, TAR.GZ is the native format. tar xzf deploy.tar.gz works on virtually every Linux system without installing anything.
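A deployment round trip might look like the sketch below. The app and release directories and their files are placeholders; on a real server the extraction step would run on the remote machine.

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/app/bin" "$work/release"
echo 'echo "app started"' > "$work/app/bin/start.sh"
echo "port=8080" > "$work/app/config.ini"

# Build step: -C changes into the app directory first, so paths inside the
# archive are relative (./bin/start.sh) rather than absolute.
tar -czf "$work/deploy.tar.gz" -C "$work/app" .

# Server step: unpack into the release directory.
tar -xzf "$work/deploy.tar.gz" -C "$work/release"
ls -R "$work/release"
```

Using -C on both sides keeps the archive free of machine-specific path prefixes, so it unpacks cleanly wherever you point it.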
Source code distribution. GitHub offers a .tar.gz of the source (alongside a .zip) for every release, npm packages are gzip-compressed tarballs (.tgz), and most open-source projects publish their source as .tar.gz. It's convention, and it preserves the Unix permissions your build scripts depend on.
Backups with metadata. TAR preserves ownership (uid/gid), permissions (including setuid/setgid), timestamps, and symlinks. ZIP loses most of this. If you're backing up a Linux filesystem, TAR is the correct choice.
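Here is a small check of that metadata round trip, with made-up file names. It archives an executable script with mode 750 plus a symlink, restores them elsewhere, and lists the result. Note that restoring ownership (uid/gid) generally only works when extracting as root, so this sketch looks only at permissions and symlinks.

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/backup-src" "$work/restore"

# A script with restrictive permissions and a relative symlink pointing at it.
printf '#!/bin/sh\necho run\n' > "$work/backup-src/job.sh"
chmod 750 "$work/backup-src/job.sh"
ln -s job.sh "$work/backup-src/latest"

tar -czf "$work/backup.tar.gz" -C "$work" backup-src

# -p asks tar to restore the recorded permissions instead of applying the umask.
tar -xzpf "$work/backup.tar.gz" -C "$work/restore"
ls -l "$work/restore/backup-src"
```

The listing should show job.sh still executable and latest still a symlink rather than a copy of the file.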
Large dataset archives. Scientific datasets, log archives, database dumps — these are typically TAR.GZ because the streaming nature means you can create them without needing enough disk space for the uncompressed data plus the archive simultaneously.
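The streaming property is what makes pipelines like the following possible. In this sketch the data directory is sample content and the ssh host in the comment is hypothetical. One tar writes a compressed archive to standard output while another reads it from standard input, so no archive file ever lands on disk.

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/data" "$work/copy"
i=1
while [ "$i" -le 5 ]; do
  echo "sample record $i" > "$work/data/part$i.csv"
  i=$((i + 1))
done

# "-f -" means "use stdout" when creating and "use stdin" when extracting.
tar -czf - -C "$work" data | tar -xzf - -C "$work/copy"

# The same pattern works across machines (host and path are placeholders):
#   tar -czf - data | ssh user@backup.example.com 'cat > /srv/backups/data.tar.gz'

ls "$work/copy/data"
```

A ZIP writer cannot finish this way without buffering, because the central directory is only written at the end of the archive.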
Creating TAR Archives with MyPDF
MyPDF's TAR creator lets you build TAR and TAR.GZ archives directly in the browser. Upload your files, choose whether you want plain TAR or gzip compression, and download the result. This is particularly handy when you're on Windows and need to create a TAR.GZ for a Linux deployment. Windows doesn't have native TAR creation (well, Windows 10+ has a basic tar command, but it's limited). Rather than installing WSL or 7-Zip, you can do it in the browser.
Compression Level Trade-offs
Gzip supports compression levels 1-9:
- Level 1 — Fastest, least compression. Good for large files you need to archive quickly.
- Level 6 — The default. Reasonable speed, good compression.
- Level 9 — Slowest, best compression. Marginal improvement over 6 for most data.
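To see the trade-off on your own data, you can pipe one TAR stream through gzip at each level and compare. The sketch below uses a generated log file as stand-in data, so the numbers it prints only indicate the shape of the curve. Timing each run with the time command would show the speed side.

```shell
set -eu
work=$(mktemp -d)
mkdir "$work/logs"

# Stand-in data: 20000 similar but not identical log lines.
i=1
while [ "$i" -le 20000 ]; do
  echo "2024-01-01 request id=$i status=200 path=/api/items/$((i % 97))"
  i=$((i + 1))
done > "$work/logs/app.log"

tar -cf "$work/logs.tar" -C "$work" logs

# Same input, three compression levels.
for level in 1 6 9; do
  gzip -c "-$level" "$work/logs.tar" > "$work/logs-$level.tar.gz"
  echo "level $level: $(($(wc -c < "$work/logs-$level.tar.gz"))) bytes"
done
```

Whatever the level, the output decompresses to the same bytes. The level only changes how hard gzip works to find matches.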
The Naming Convention
You'll see these extensions used interchangeably:
- .tar.gz and .tgz: identical format, different naming convention
- .tar.bz2 and .tbz2: same deal
- .tar.xz and .txz: same deal
The short forms (.tgz, .tbz2, .txz) date back to when filesystems had 8.3 filename limits. Both are correct. I personally prefer the explicit .tar.gz because it makes the two-layer nature obvious to people who might not know what .tgz means.
Extracting TAR Archives
On Linux and Mac, it's one command: tar xzf archive.tar.gz. On Windows, 7-Zip handles TAR.GZ natively (right-click, extract). IrfanView won't help you here — this is an archive format, not an image format.
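Beyond the one-liner, two variations come up often: listing an archive without unpacking it, and pulling out a single file. The sketch below builds a throwaway archive (the project layout is invented) and shows both. Even for a single file, tar still reads through the compressed stream sequentially to find it.

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/project/docs" "$work/out"
echo "readme" > "$work/project/README.md"
echo "guide" > "$work/project/docs/guide.md"
tar -czf "$work/archive.tar.gz" -C "$work" project

# List the contents without extracting anything.
tar -tzf "$work/archive.tar.gz"

# Extract one member into a chosen directory; the path must match the listing.
tar -xzf "$work/archive.tar.gz" -C "$work/out" project/docs/guide.md
find "$work/out" -type f
```

Listing first is a cheap way to check whether an archive unpacks into its own top-level directory or spills files into the current one.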
Related Tools
- Create TAR Archive — Build TAR and TAR.GZ archives in the browser
- Extract TAR Archive — Decompress TAR.GZ files online
- Create ZIP Archive — Build ZIP archives for cross-platform sharing
- Extract ZIP Archive — Unzip files online