March 24, 2026 · 4 min read

How to Redact a PDF Properly — The Black Rectangle Isn't Enough

Permanently remove sensitive information from PDFs. Why most 'redaction' methods fail, how real redaction works, and avoiding costly data leaks.


The $125 Million Redaction Mistake

In 2019, Paul Manafort's legal team filed a court document with "redacted" sections: black rectangles covering text. The problem: anyone could copy and paste the "hidden" text, because the redaction was cosmetic, not real. Reporters quickly recovered and published the supposedly confidential passages.

This happens more often than you'd think. Law firms, government agencies, hospitals, and businesses regularly "redact" documents by drawing shapes over text or highlighting in black. None of these methods actually remove the data.

What Fake Redaction Looks Like

These methods DO NOT work:

Method | Why It Fails
Black rectangle drawn on top | The text underneath is still there; copy-paste reveals it
Black highlight | Same problem: the highlight is a layer on top, and the text still exists below
White text on white background | Select-all reveals it; screen readers read it aloud
Cropping the page | The original content may still exist outside the visible area
Taking a screenshot and re-inserting | Better, but OCR can sometimes recover text if the covering is not fully opaque
Printing and scanning | Best workaround, but tedious and loses quality
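
To see why the black rectangle fails, it helps to look at what a PDF page actually stores. Below is a minimal sketch in Python, using a hand-written, uncompressed content stream (an assumption made for illustration; real files usually compress their streams and use font-specific encodings). The stream draws a line of text and then paints a black box over it. The function name `visible_to_copy_paste` is made up for this example.

```python
import re

# A hand-written PDF page content stream (illustrative, not from a real file).
# It shows a line of text, then paints a filled black rectangle over it.
CONTENT_STREAM = (
    b"BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET\n"  # show the text
    b"0 0 0 rg\n"                                          # fill color: black
    b"70 695 160 16 re f\n"                                # rectangle on top
)

# Literal strings passed to the Tj ("show text") operator.
TJ_STRING = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def visible_to_copy_paste(stream: bytes) -> list[str]:
    """Return every literal string a text extractor would find in the stream."""
    return [m.group(1).decode("latin-1") for m in TJ_STRING.finditer(stream)]


print(visible_to_copy_paste(CONTENT_STREAM))  # ['SSN: 123-45-6789']
```

The rectangle changes what you see, not what the file contains. The text operator is still in the stream, so any extractor finds it.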

What Real Redaction Does

Proper PDF redaction:

  1. Identifies the content to be removed (text, images, or regions)
  2. Destroys the underlying data — the actual character data, not just the visual representation
  3. Replaces the removed area with a solid color (usually black)
  4. Removes hidden metadata, comments, and form data that might contain the redacted information
  5. Flattens the result so no layers or history remain
After real redaction, the data is gone. Not hidden behind a shape. Not buried in a layer. Gone. No tool, no forensics lab, no amount of PDF parsing can recover it.
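
Steps 2 and 3 can be sketched on the same kind of toy content stream. This is a simplified illustration, not how any particular redaction tool is implemented: it assumes an uncompressed stream, plain literal strings, and a cover box you supply by hand. The function name `redact_stream` and its parameters are hypothetical. Real tools also have to handle compressed streams, kerned text arrays, font encodings, images, and the hidden data covered in steps 4 and 5.

```python
import re

# A literal string operand followed by the Tj ("show text") operator.
TJ_OP = re.compile(rb"\((?:\\.|[^\\()])*\)\s*Tj")


def redact_stream(stream: bytes, secret: bytes,
                  box: tuple[int, int, int, int]) -> bytes:
    """Empty every Tj operand containing `secret`, then paint `box` black."""

    def drop_if_secret(match: re.Match) -> bytes:
        # Step 2: destroy the character data itself, not just its appearance.
        return b"() Tj" if secret in match.group(0) else match.group(0)

    cleaned = TJ_OP.sub(drop_if_secret, stream)
    x, y, width, height = box
    # Step 3: cover the area. q/Q save and restore the graphics state so the
    # black fill color does not affect anything drawn afterwards.
    cover = b"\nq 0 0 0 rg %d %d %d %d re f Q\n" % (x, y, width, height)
    return cleaned + cover


page = (
    b"BT /F1 12 Tf 72 720 Td (Employee record) Tj ET\n"
    b"BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET\n"
)
redacted = redact_stream(page, b"123-45-6789", (70, 695, 160, 16))
print(b"123-45-6789" in redacted)  # False
```

The difference from the fake methods is the order of operations: the bytes are removed first, and the black box is only a visual marker for where they used to be.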

How to Redact Properly

Online (Free)

MyPDF's Redact PDF tool performs real redaction — select areas or search for text patterns, and the underlying data is permanently removed. Files are auto-deleted from servers after 15 minutes.

Desktop (More Control)

Adobe Acrobat Pro ($23/month): The gold standard. Tools → Redact → Mark for Redaction → Apply Redactions. Also has "Remove Hidden Information" to strip metadata, comments, and hidden layers.

PDF-XChange Editor (free tier has some redaction): Mark areas and apply.

The "Poor Man's Redaction"

If you have no redaction tool:
  1. Print the PDF to paper
  2. Use a black marker to cover sensitive text
  3. Scan the marked-up paper back to PDF
This is crude but effective: the digital data is gone because you created a new image from the physical redaction. Use a fully opaque marker, since thin ink can show through on a scan. The downside: you lose text selectability and the document quality degrades.

What to Redact (Checklist)

Information Type | Why Redact
Social Security numbers | Identity theft
Bank account numbers | Financial fraud
Credit card numbers | Card fraud
Home addresses | Personal safety
Phone numbers | Harassment, spam
Email addresses | Phishing, spam
Medical information | HIPAA, privacy
Minors' names | Child protection
Witness identities | Safety, legal
Attorney-client communications | Privilege
Trade secrets | Competitive harm
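
Some items on this checklist follow predictable formats, so a quick pattern search can flag candidates before you start marking. The sketch below is a rough first pass in Python. The patterns are US-centric assumptions written for this example, not an authoritative detector, and they will produce both false positives and misses. Names, addresses, medical details, and privileged content still need a human read.

```python
import re

# Illustrative patterns only (assumed formats, US-centric).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "phone": re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_sensitive(text: str) -> dict[str, list[str]]:
    """Map each pattern name to the matches found in `text`."""
    hits = {name: rx.findall(text) for name, rx in PATTERNS.items()}
    # Keep only the categories that actually matched something.
    return {name: found for name, found in hits.items() if found}


sample = "Contact jane.doe@example.com or 555-123-4567. SSN 123-45-6789."
for kind, values in find_sensitive(sample).items():
    print(kind, values)
```

Treat the output as a to-do list for redaction, not as proof that nothing else in the document is sensitive.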

Don't Forget Hidden Data

PDFs contain more than visible text:

  • Metadata: Author name, creation date, software used, company name
  • Comments and annotations: Review notes that may contain sensitive discussion
  • Form field data: Previously entered values
  • Hidden layers: Content that's invisible but extractable
  • Embedded files: Attachments within the PDF
  • JavaScript: Scripts that may reference sensitive URLs or data
  • Bookmarks: May contain sensitive titles or notes
After redacting visible content, run "Remove Hidden Information" (Acrobat) or use MyPDF's Flatten PDF to strip all non-essential data.
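
For a rough idea of which hidden data a file carries, you can scan its raw bytes for the PDF name tokens that usually introduce each item in the list above. The sketch below does that in Python. The function name `hidden_data_report` and the category labels are made up for this example. One important limitation: many PDFs store objects in compressed streams, which a plain byte scan cannot see, so an empty report is not proof that the file is clean.

```python
# PDF name tokens that typically signal each kind of hidden data.
HIDDEN_MARKERS = {
    "metadata": (b"/Author", b"/Producer", b"/Metadata"),
    "annotations": (b"/Annots",),
    "form fields": (b"/AcroForm",),
    "hidden layers": (b"/OCProperties",),
    "embedded files": (b"/EmbeddedFiles",),
    "javascript": (b"/JavaScript", b"/JS"),
    "bookmarks": (b"/Outlines",),
}


def hidden_data_report(pdf_bytes: bytes) -> list[str]:
    """List the kinds of hidden data whose marker appears in the raw bytes."""
    return [
        kind
        for kind, markers in HIDDEN_MARKERS.items()
        if any(marker in pdf_bytes for marker in markers)
    ]


# A fragment standing in for a real file (illustrative bytes only).
sample = b"<< /Type /Catalog /AcroForm 5 0 R /Outlines 6 0 R >> << /Author (Jane) >>"
print(hidden_data_report(sample))  # ['metadata', 'form fields', 'bookmarks']
```

On a real document you would pass in the file contents, for example `hidden_data_report(open("redacted.pdf", "rb").read())`, where `redacted.pdf` is a placeholder name.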

Verification: Did the Redaction Work?

After redacting, always verify:

  1. Open the redacted PDF
  2. Select all text (Ctrl+A / Cmd+A)
  3. Copy (Ctrl+C / Cmd+C)
  4. Paste into a text editor (Notepad, TextEdit)
  5. Search for any sensitive terms that should have been removed
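If any redacted text appears in the pasted content, the redaction failed. Go back and use a proper redaction tool. A clean paste only proves the visible text layer is clean; metadata and other hidden data need a separate check.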
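
Step 5 is easy to get wrong by eye, because pasted PDF text often comes out with changed spacing, line breaks, or punctuation. Here is a small Python sketch that compares the pasted text against your list of sensitive terms while ignoring case and anything that is not a letter or digit. The function names are hypothetical, and the looser matching is a design choice for this example: it can raise false alarms on short terms, which is the safer direction for a leak check.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and strip everything except letters and digits."""
    return re.sub(r"[\W_]+", "", text.lower())


def leaked_terms(pasted_text: str, sensitive_terms: list[str]) -> list[str]:
    """Return the sensitive terms that still appear in the pasted text."""
    haystack = _normalize(pasted_text)
    return [
        term
        for term in sensitive_terms
        # Skip terms that normalize to nothing; they would match everything.
        if _normalize(term) and _normalize(term) in haystack
    ]


pasted = "Name: [redacted]\nSSN: 123 45\n6789"
print(leaked_terms(pasted, ["123-45-6789", "Jane Doe"]))  # ['123-45-6789']
```

An empty list is what you want to see. Anything else means the redaction failed for that term.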

Industry-Specific Requirements

Legal (US courts): Federal Rule of Civil Procedure 5.2 limits what filings may show: only the last four digits of SSNs, taxpayer IDs, and financial account numbers, only the year of a birth date, and only the initials of a minor.

Healthcare (HIPAA): Under the Safe Harbor de-identification method, all 18 HIPAA identifiers must be removed before a document counts as de-identified. Other disclosures should be limited to the "minimum necessary" information.

Financial (GLBA, PCI-DSS): Customer financial data must be redacted before sharing externally.