March 24, 2026 · 5 min read

How to Create TAR and TAR.GZ Archives — The Linux Standard

Understand TAR and TAR.GZ archives, when to use them instead of ZIP, and how to create them for Linux deployments, source code distribution, and backups.


If you work with Linux servers, open-source software, or any kind of deployment pipeline, you've seen .tar.gz files. They're everywhere. And yet, most people who use them daily don't really understand what they are or why they exist as two separate layers.

Let's fix that.

TAR Is Not Compression

This is the most common misunderstanding. TAR stands for Tape Archive. It was created in 1979 for writing files sequentially to magnetic tape drives. All it does is concatenate multiple files and directories into a single stream, preserving file permissions, ownership, timestamps, and directory structure.

A .tar file is not compressed. It's roughly the same size as the original files combined, and usually slightly larger, because every file gets a 512-byte header and its data is padded to a multiple of 512 bytes.
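To see the "not compressed" part for yourself, here is a minimal sketch using Python's standard tarfile module. The file names and contents are made up for the demo, and everything stays in memory.

```python
import io
import tarfile

# Hypothetical file names and contents, held in memory for the demo.
files = {
    "app/main.py": b"print('hello')\n" * 40,
    "app/README.txt": b"Example project\n" * 10,
}


def build_tar(members: dict[str, bytes]) -> bytes:
    """Bundle a name -> bytes mapping into an uncompressed TAR."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


tar_bytes = build_tar(files)
payload_size = sum(len(data) for data in files.values())

# The archive is larger than the data it holds: headers plus padding.
print(f"payload: {payload_size} bytes, tar: {len(tar_bytes)} bytes")
```

The overhead looks large here only because the files are tiny. On real projects, the headers and padding are a rounding error.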

Compression is a separate step. You pipe the TAR stream through a compression algorithm:

  • TAR + gzip = .tar.gz or .tgz — The classic. Fast compression, universally supported.
  • TAR + bzip2 = .tar.bz2 — Better compression ratio, slower. Less common now.
  • TAR + xz = .tar.xz — Best compression ratio, slowest. Used by many Linux distros for package distribution.
  • TAR + zstd = .tar.zst — Modern. Fast like gzip, compresses like bzip2. Growing adoption.

This two-layer design is classic Unix philosophy: each tool does one thing well. TAR bundles, gzip compresses. ZIP does both in a single format, which is simpler but less flexible.
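The two layers are easy to see if you apply them by hand. The sketch below (Python standard library, made-up file names) first builds a plain TAR and then runs it through gzip as a separate step. It's meant as an illustration of the layering, not as how any particular tool does it internally.

```python
import gzip
import io
import tarfile


def make_tar(members: dict[str, bytes]) -> bytes:
    """Layer 1: bundle files into one uncompressed TAR stream."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Hypothetical contents for the demo.
tar_bytes = make_tar({
    "notes/a.txt": b"first file\n",
    "notes/b.txt": b"second file\n",
})

# Layer 2: compress the whole stream. The result is a valid .tar.gz.
tar_gz_bytes = gzip.compress(tar_bytes)

# Peeling the gzip layer off again gives back the exact TAR bytes.
layers_match = gzip.decompress(tar_gz_bytes) == tar_bytes

# tarfile can also undo both layers in one go with mode "r:gz".
with tarfile.open(fileobj=io.BytesIO(tar_gz_bytes), mode="r:gz") as archive:
    names = archive.getnames()

print(layers_match, names)
```

Swapping gzip for the bz2 or lzma module in the second step gives .tar.bz2 or .tar.xz without touching the bundling code, which is the flexibility the design buys you.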

TAR vs ZIP — When Each Makes Sense

This comes up constantly, so here's the honest comparison:

| Factor | TAR.GZ | ZIP |
|---|---|---|
| Unix permissions preserved | Yes | Partially (depends on tool) |
| Symlinks preserved | Yes | Rarely |
| Native on Linux/Mac | Yes | Mostly (zip/unzip packages) |
| Native on Windows | Partially (command-line tar on Windows 10+; otherwise 7-Zip or similar) | Yes (built-in since XP) |
| Compression ratio | Better (compresses as one stream) | Slightly worse (each file compressed individually) |
| Random file access | No (must decompress sequentially) | Yes (can extract single files) |
| Streaming friendly | Yes | No (needs central directory at end) |

The compression ratio difference is real and worth understanding. When gzip compresses a TAR stream, it sees all files as one continuous stream of data, so patterns that repeat across neighboring files can be stored as back-references instead of being written out again (within gzip's 32 KB window). ZIP compresses each file independently, so cross-file redundancy goes uncompressed.

For source code repositories, where files are full of repeated keywords and patterns, TAR.GZ typically produces archives around 10-20% smaller than ZIP on the same content, though the exact gain depends on how many small, similar files there are.
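You can check the cross-file effect with a quick experiment. This sketch packs the same synthetic "source tree" both ways using Python's zipfile and tarfile modules. The 200 tiny, near-identical files are an assumption chosen to make the effect obvious, so expect a much bigger gap here than on a real project.

```python
import io
import tarfile
import zipfile

# Hypothetical source tree: 200 small files sharing most of their text.
sources = {
    f"src/module_{i}.py": (
        "import os\nimport sys\n\n"
        f"def handler_{i}(request):\n"
        "    if request is None:\n"
        "        raise ValueError('request is required')\n"
        f"    return {{'id': {i}, 'status': 'ok'}}\n"
    ).encode()
    for i in range(200)
}


def zip_size(members: dict[str, bytes]) -> int:
    """Size of a ZIP where each file is deflated on its own."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    return len(buffer.getvalue())


def tar_gz_size(members: dict[str, bytes]) -> int:
    """Size of a TAR.GZ where gzip sees all files as one stream."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return len(buffer.getvalue())


zip_bytes = zip_size(sources)
tgz_bytes = tar_gz_size(sources)
print(f"zip: {zip_bytes} bytes, tar.gz: {tgz_bytes} bytes")
```

Part of the difference in this demo comes from ZIP's per-file bookkeeping (a local header plus a central directory entry for every file), not just from compression. With a handful of large files, the two formats end up much closer.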

When You Actually Need TAR

  • Deploying to Linux servers. If you're shipping code to a VPS, Docker container, or CI/CD pipeline, TAR.GZ is the native format. tar xzf deploy.tar.gz works on virtually every Linux system without installing anything.
  • Source code distribution. GitHub generates a .tar.gz source archive for every release, npm packages are gzipped tarballs, and most open-source projects publish their sources as .tar.gz or .tar.xz. It's convention, and it preserves the Unix permissions your build scripts depend on.
  • Backups with metadata. TAR preserves ownership (uid/gid), permissions (including setuid/setgid), timestamps, and symlinks. ZIP loses most of this unless the tool adds Unix extensions. If you're backing up a Linux filesystem, TAR is the correct choice.
  • Large dataset archives. Scientific datasets, log archives, and database dumps are typically TAR.GZ because the format streams: you can pipe the archive straight to another disk or machine as it is created, without needing local space for both the data and the finished archive.
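The metadata point is easy to verify. In this sketch (Python's tarfile module, with made-up names, ids, and timestamp), an executable script and a symlink go into a TAR.GZ, and the mode, owner ids, modification time, and link target all come back out of the archive unchanged.

```python
import io
import tarfile

buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    # A regular file with explicit metadata (all values are examples).
    script = b"#!/bin/sh\necho deploying\n"
    info = tarfile.TarInfo(name="release/deploy.sh")
    info.size = len(script)
    info.mode = 0o755              # executable bit must survive
    info.uid, info.gid = 1000, 1000
    info.mtime = 1700000000
    archive.addfile(info, io.BytesIO(script))

    # A symlink is stored as a link, not as a copy of its target.
    link = tarfile.TarInfo(name="release/current")
    link.type = tarfile.SYMTYPE
    link.linkname = "deploy.sh"
    archive.addfile(link)

# Read the archive back and inspect what was recorded.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as archive:
    members = {member.name: member for member in archive.getmembers()}

stored = members["release/deploy.sh"]
current = members["release/current"]
print(oct(stored.mode), stored.uid, stored.gid, stored.mtime)
print(current.issym(), current.linkname)
```

Whether the ownership is actually restored on extraction depends on who extracts: by default only root can hand files back to another uid.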

Creating TAR Archives with MyPDF

MyPDF's TAR creator lets you build TAR and TAR.GZ archives directly in the browser. Upload your files, choose whether you want plain TAR or gzip compression, and download the result.

This is particularly handy when you're on Windows and need to create a TAR.GZ for a Linux deployment. Windows 10 and later do ship a basic command-line tar, but older versions have nothing built in, and not everyone wants to work in a terminal. Rather than installing WSL or 7-Zip, you can do it in the browser.
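If you'd rather script it, Python's standard library can build the same kind of archive on any platform, Windows included. This is a minimal sketch and says nothing about how MyPDF works internally. The site folder and deploy.tar.gz name are placeholders created in a temporary directory.

```python
import tarfile
import tempfile
from pathlib import Path


def create_tar_gz(source_dir: Path, archive_path: Path) -> None:
    """Pack source_dir into a gzip-compressed TAR at archive_path."""
    with tarfile.open(archive_path, mode="w:gz") as archive:
        # arcname keeps paths inside the archive relative (site/...),
        # instead of baking in the full path from this machine.
        archive.add(source_dir, arcname=source_dir.name)


with tempfile.TemporaryDirectory() as workspace:
    # Build a tiny placeholder project to archive.
    site = Path(workspace) / "site"
    (site / "assets").mkdir(parents=True)
    (site / "index.html").write_text("<h1>Hello</h1>\n")
    (site / "assets" / "style.css").write_text("h1 { color: teal; }\n")

    archive_path = Path(workspace) / "deploy.tar.gz"
    create_tar_gz(site, archive_path)

    # List what ended up inside.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        archived_names = sorted(archive.getnames())

print(archived_names)
```

One caveat on Windows: the filesystem has no Unix executable bit, so scripts archived this way may need a chmod +x after extraction on the server.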

Compression Level Trade-offs

Gzip supports compression levels 1-9:

  • Level 1 — Fastest, least compression. Good for large files you need to archive quickly.
  • Level 6 — The default. Reasonable speed, good compression.
  • Level 9 — Slowest, best compression. Marginal improvement over 6 for most data.

In practice, the difference between level 6 and level 9 is usually 2-5% smaller file size for 3-5x longer compression time. Not worth it for most uses. Level 6 is the default for a reason.
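The trade-off is easy to measure on your own data. The sketch below compresses the same buffer at levels 1, 6, and 9 with Python's gzip module and reports size and time. The log-like test data is synthetic, so treat the numbers it prints as a rough indication only.

```python
import gzip
import time

# Synthetic log-like data (an assumption for the demo; real data differs).
data = "".join(
    f"2026-03-24T12:{i % 60:02d}:00 GET /api/items/{i % 97} 200 {i * 7 % 5000}\n"
    for i in range(50_000)
).encode()

sizes = {}
for level in (1, 6, 9):
    started = time.perf_counter()
    compressed = gzip.compress(data, compresslevel=level)
    elapsed_ms = (time.perf_counter() - started) * 1000
    sizes[level] = len(compressed)
    print(f"level {level}: {len(compressed):>8} bytes in {elapsed_ms:.1f} ms")

print(f"original: {len(data)} bytes")
```

Run it against a sample of what you actually archive before settling on a level. Already-compressed content such as JPEGs or video barely shrinks at any setting.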

The Naming Convention

You'll see these extensions used interchangeably:

  • .tar.gz and .tgz — identical format, different naming convention
  • .tar.bz2 and .tbz2 — same deal
  • .tar.xz and .txz — same deal

The short forms (.tgz, .tbz2, .txz) date back to filesystems with 8.3 filename limits, such as MS-DOS, where a name could carry only one short extension. Both forms are correct. I personally prefer the explicit .tar.gz because it makes the two-layer nature obvious to people who might not know what .tgz means.

Extracting TAR Archives

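On Linux and Mac, it's one command: tar xzf archive.tar.gz (modern versions of tar also detect the compression on their own, so tar xf archive.tar.gz works too). On Windows 10 and later, the same tar command is available from the command line, and 7-Zip handles TAR.GZ from the right-click menu, although it may unpack the .gz and .tar layers as two separate steps. Image viewers such as IrfanView won't help you here: this is an archive format, not an image format.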
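For scripted extraction, here is a sketch with Python's tarfile module. The extract_archive helper and the file names are made up for the example. It builds a small archive first so the whole thing runs on its own, and it uses the "data" extraction filter where the installed Python provides it.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def extract_archive(archive_path: Path, destination: Path) -> list[str]:
    """Extract a TAR archive (any supported compression) into destination."""
    # "r:*" lets tarfile detect gzip, bzip2, or xz from the file contents.
    with tarfile.open(archive_path, mode="r:*") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions can reject unsafe members, such as
            # absolute paths or links that escape the destination.
            archive.extractall(destination, filter="data")
        else:
            archive.extractall(destination)  # only for archives you trust
        return archive.getnames()


with tempfile.TemporaryDirectory() as workspace:
    # Create a small archive to extract (placeholder name and content).
    archive_path = Path(workspace) / "archive.tar.gz"
    payload = b"hello from the archive\n"
    with tarfile.open(archive_path, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="docs/hello.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

    output_dir = Path(workspace) / "out"
    extracted_names = extract_archive(archive_path, output_dir)
    extracted_text = (output_dir / "docs" / "hello.txt").read_bytes()

print(extracted_names, extracted_text)
```

The filter matters for archives from untrusted sources. A crafted TAR can contain paths like ../../etc/passwd, and a naive extraction will happily write outside the target folder.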

MyPDF's archive extractor also works in the browser for those who want a quick extraction without installing anything.