March 26, 2026 · 9 min read

Export Indian Language Text to PDF, Word, and Plain Text

How document export works for Indian language text from TranslitHub — font embedding, formatting preservation, PDF generation, Word compatibility, and batch export workflows.

export pdf word document indian languages

The most frustrating moment when working with Indian language text is finishing a document and then watching it break the moment you try to share it. The Devanagari renders as boxes. The Tamil becomes question marks. The carefully laid-out Gujarati letter collapses into unreadable gibberish when the recipient opens it on a machine without the right font installed.

Document export from TranslitHub is specifically designed to avoid these problems. This guide explains how each export format handles Indian scripts, what gets preserved and what doesn't, and how to choose the right format for different use cases.

Why Indian Language Export Is Harder Than Latin Script

With English text, you export a Word document and it looks the same almost everywhere, because Times New Roman and Arial (or close substitutes) have shipped with mainstream operating systems since the early 1990s. With Indian language text, the situation is more complicated:

  • Script complexity: Indic scripts use combining characters: vowel signs (matras) that attach to consonants, half-forms, and conjuncts (two or more consonants merged into a single glyph). Rendering these correctly requires a font that includes the necessary glyph variants, plus an OpenType shaping engine to select the right variant in context.
  • Font availability: Not every computer has Mangal, Latha, Noto Sans Devanagari, or the other fonts needed for Indian scripts. A document that looks fine on the computer it was created on can be unreadable on another machine.
  • Encoding history: Indian language content has a complicated history of non-Unicode encodings (ISCII, and legacy ASCII-mapped fonts such as Kruti Dev and Shivaji) in which the same code points produced different characters depending on the font. Unicode fixed this, but legacy documents still exist, and if your tool exports in a legacy encoding, recipients may see garbage.
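To make the conjunct point concrete, here is a minimal Python sketch that lists the code points stored for the Hindi conjunct क्ष. It uses only the standard `unicodedata` module; the names `KSSA` and `describe` are illustrative and not part of TranslitHub.

```python
import unicodedata

# The conjunct "क्ष", written with escapes so the stored code points are unambiguous.
KSSA = "\u0915\u094d\u0937"  # KA + VIRAMA + SSA


def describe(text: str) -> list[str]:
    """Return the Unicode name of every code point in text."""
    return [unicodedata.name(ch) for ch in text]


if __name__ == "__main__":
    print(len(KSSA))  # how many code points are stored
    for name in describe(KSSA):
        print(name)
```

The text holds three separate code points, but a reader sees one shape. Turning the sequence into that single glyph is the job of the font and the shaping engine, which is why a missing or incomplete font breaks the display even when the stored text is correct.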

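TranslitHub exports exclusively in Unicode (UTF-8/UTF-16), and the PDF export embeds fonts. Unicode removes the encoding ambiguity, and an embedded font means a PDF does not depend on what the recipient has installed. DOCX and TXT exports remain Unicode but still rely on the recipient's fonts, as described below.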

Exporting to PDF

PDF is the safest format for Indian language documents that need to look consistent everywhere. When TranslitHub exports a PDF, it:

  1. Renders the document at screen quality using the web font loaded in the editor
  2. Embeds a subset of that font (only the glyphs actually used in your document) into the PDF file
  3. Produces a file that renders identically on macOS, Windows, Linux, iOS, Android, and any standard PDF viewer — regardless of what fonts are installed
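Subsetting starts from knowing which characters a document actually uses. The sketch below covers only that first step, collecting the distinct code points in a text. It is not TranslitHub's implementation, and a real subsetter must also keep the conjunct and half-form glyphs that shaping produces from those code points. The function name `codepoints_used` is hypothetical.

```python
def codepoints_used(text: str) -> list[str]:
    """Return the distinct non-whitespace code points in text as sorted U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in sorted(set(text)) if not ch.isspace()]


if __name__ == "__main__":
    # "क्ष कक" written with escapes: KA, VIRAMA, SSA, space, KA, KA
    sample = "\u0915\u094d\u0937 \u0915\u0915"
    print(codepoints_used(sample))
```

Repeated letters collapse to one entry, which is why a subset font for a short letter is far smaller than the full font file.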

What Gets Preserved in PDF

| Element | Preserved |
|---|---|
| Bold, italic, underline | Yes |
| Headings (H1, H2, H3) | Yes, with size and weight differences |
| Bullet and numbered lists | Yes |
| Text alignment (including justified) | Yes |
| Font size | Yes |
| Script rendering (conjuncts, matras) | Yes |
| Hyperlinks | Yes, clickable in PDF viewers |
| Page margins | Set to A4 standard (adjustable) |

PDF Page Sizes

The PDF export defaults to A4 (210×297mm) with 25mm margins — standard for formal documents in India. You can change this to:


  • Letter (US standard, for international recipients)
  • Legal (for court filings and legal documents)
  • A5 (half-A4, useful for booklets)


For official government correspondence, stick with A4. For affidavits and court documents, check the specific court's requirements — some require specific font sizes and margins.
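If you need to check a court's or office's layout requirements against these page sizes, the arithmetic is simple. The sketch below computes the printable text area for each size with a uniform margin and converts millimetres to PDF points (72 per inch). The page dimensions are the standard ones; the names `PAGE_SIZES_MM`, `text_area_mm`, and `mm_to_points` are illustrative and not TranslitHub settings.

```python
# Standard page sizes in millimetres (width, height).
PAGE_SIZES_MM = {
    "A4": (210.0, 297.0),
    "A5": (148.0, 210.0),
    "Letter": (215.9, 279.4),
    "Legal": (215.9, 355.6),
}

MM_PER_INCH = 25.4
POINTS_PER_INCH = 72.0


def text_area_mm(page: str, margin_mm: float = 25.0) -> tuple[float, float]:
    """Return the (width, height) left for text after a uniform margin on all sides."""
    width, height = PAGE_SIZES_MM[page]
    return (width - 2 * margin_mm, height - 2 * margin_mm)


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PDF points (1 point = 1/72 inch)."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


if __name__ == "__main__":
    for name in PAGE_SIZES_MM:
        print(name, text_area_mm(name))
```

With the default A4 page and 25mm margins, the text block is 160mm wide and 247mm tall.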

Practical Scenario: A Formal Letter in Marathi

A school principal needs to send a circular to parents in Marathi. She:


  1. Types the letter in the TranslitHub editor using Marathi (phonetic input)
  2. Uses justified alignment (standard for formal Marathi correspondence)
  3. Sets font size to 12pt with 1.5 line spacing for readability
  4. Exports to PDF


The resulting PDF can be emailed to parents, printed from any computer, or uploaded to the school's website — it looks identical in every context.

Exporting to DOCX (Word)

Word export creates a .docx file compatible with Microsoft Word, Google Docs, LibreOffice Writer, and WPS Office. It's the right format when you need to:

  • Send a document for review or editing by someone else
  • Continue working on the document in Word later
  • Submit to a publisher or platform that requires .docx
  • Collaborate with others who don't use TranslitHub

Font Selection for DOCX

DOCX files normally reference fonts by name rather than embedding them, so the document depends on the recipient having a compatible font installed. TranslitHub DOCX exports use these fonts:

| Script | Font Used in DOCX |
|---|---|
| Devanagari (Hindi, Marathi, Sanskrit) | Mangal |
| Bengali | Vrinda |
| Tamil | Latha |
| Telugu | Gautami |
| Kannada | Tunga |
| Malayalam | Kartika |
| Gujarati | Shruti |
| Gurmukhi (Punjabi) | Raavi |
| Odia | Kalinga |

These fonts have shipped with Windows for many years (most of them since Windows XP), although recent Windows versions may install some of them only as optional language features. On macOS and Linux, equivalents are usually available, but a recipient may occasionally see fallback fonts, which are functional but visually different.
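If you post-process exported text yourself, the script-to-font mapping above can be reproduced from Unicode block ranges. The following is a rough sketch under that assumption. The ranges are the standard Unicode blocks and the font names come from the table, but `SCRIPT_FONTS` and `docx_font_for` are hypothetical names, not a TranslitHub API.

```python
# Unicode block ranges for each script, paired with the DOCX font listed in the table.
SCRIPT_FONTS = [
    (0x0900, 0x097F, "Devanagari", "Mangal"),
    (0x0980, 0x09FF, "Bengali", "Vrinda"),
    (0x0A00, 0x0A7F, "Gurmukhi", "Raavi"),
    (0x0A80, 0x0AFF, "Gujarati", "Shruti"),
    (0x0B00, 0x0B7F, "Odia", "Kalinga"),
    (0x0B80, 0x0BFF, "Tamil", "Latha"),
    (0x0C00, 0x0C7F, "Telugu", "Gautami"),
    (0x0C80, 0x0CFF, "Kannada", "Tunga"),
    (0x0D00, 0x0D7F, "Malayalam", "Kartika"),
]


def docx_font_for(text):
    """Return (script, font) for the first Indic character found in text, or None."""
    for ch in text:
        code_point = ord(ch)
        for start, end, script, font in SCRIPT_FONTS:
            if start <= code_point <= end:
                return (script, font)
    return None


if __name__ == "__main__":
    print(docx_font_for("\u0928\u092e"))  # Devanagari letters NA + MA
```

The lookup stops at the first Indic character, so a mixed-script document would need a per-run check instead.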

If you need guaranteed visual consistency for a DOCX file, you have two options:


  • Tell recipients to install the Noto family of fonts (free, excellent Indian language support)
  • Export to PDF instead


What Gets Preserved in DOCX

| Element | Preserved |
|---|---|
| Bold, italic, underline | Yes |
| Headings | Yes, mapped to Word heading styles |
| Lists | Yes |
| Text alignment | Yes |
| Font size | Yes |
| Script rendering | Depends on recipient's font |
| Editable text | Yes, fully editable in Word |

Importing Back After Editing

If a colleague edits your DOCX in Microsoft Word and sends it back, you can open it in TranslitHub's editor (use File → Import). The Unicode text comes back correctly. Formatting may differ slightly depending on what was changed in Word, but the actual Indian language characters are preserved.

Exporting to Plain Text (TXT)

TXT export strips all formatting and gives you raw Unicode text in UTF-8 encoding. This is appropriate when:

  • Submitting content to a CMS that handles its own formatting
  • Providing data to a developer or database
  • Copying into apps that don't support rich text
  • Using the text in Python/JavaScript scripts

UTF-8 vs UTF-16: Which One?

TranslitHub's default TXT export is UTF-8, which is the right choice for nearly everything: web content, databases, APIs, and email. UTF-16 is available for legacy systems that specifically require it (some older Windows applications), but if you don't know what your recipient needs, use UTF-8.
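The difference between the two encodings is easy to see in a script. This short Python sketch encodes the word नमस्ते both ways and compares byte counts. It uses only built-in codecs, and the variable names are illustrative.

```python
# "नमस्ते" written with escapes: NA, MA, SA, VIRAMA, TA, VOWEL SIGN E (6 code points).
NAMASTE = "\u0928\u092e\u0938\u094d\u0924\u0947"

utf8_bytes = NAMASTE.encode("utf-8")       # 3 bytes per Devanagari code point
utf16_bytes = NAMASTE.encode("utf-16-le")  # 2 bytes per code point, no byte order mark
utf16_bom = NAMASTE.encode("utf-16")       # Python's "utf-16" codec prepends a 2-byte BOM

if __name__ == "__main__":
    print(len(NAMASTE), len(utf8_bytes), len(utf16_bytes), len(utf16_bom))
    # Decoding with the matching codec returns the original text unchanged.
    print(utf8_bytes.decode("utf-8") == NAMASTE)
```

Both encodings carry exactly the same characters. The choice only matters to the program that reads the file, which is why decoding with the wrong codec produces question marks or garbage rather than losing data.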

Line Endings

Windows apps expect CRLF (\r\n) line endings. Linux and macOS use LF (\n) only. By default, the TXT export uses LF (Unix style), which is readable on all platforms. If a Windows program displays everything on one line, use the "Windows line endings" option in the export dialog.
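If you already have an LF export and need CRLF (or the reverse), the conversion can also be done in a script. A minimal sketch, with hypothetical function names `to_crlf` and `to_lf`:

```python
def to_lf(text: str) -> str:
    """Normalize CRLF and bare CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def to_crlf(text: str) -> str:
    """Convert any mix of line endings to CRLF without doubling existing CRLFs."""
    return to_lf(text).replace("\n", "\r\n")


if __name__ == "__main__":
    mixed = "line one\nline two\r\nline three"
    print(repr(to_crlf(mixed)))
    print(repr(to_lf(mixed)))
```

Normalizing to LF first keeps the conversion safe to run twice. When writing the result to a file in Python, open it with `newline=""` so the interpreter does not translate the line endings a second time.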

Batch Export

For content creators, publishers, or educators who work with multiple documents, batch export lets you export several documents at once.

How Batch Export Works

  1. Open multiple documents in the editor (tabs)
  2. Go to File → Export All
  3. Choose format (PDF, DOCX, or TXT)
  4. TranslitHub generates all files and downloads them as a ZIP archive
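The final step, bundling the generated files into one download, can be approximated with Python's standard `zipfile` module. This is a sketch of the general technique and not TranslitHub's own code; `build_zip` and `list_zip` are hypothetical helpers, and the file contents are placeholders.

```python
import io
import zipfile


def build_zip(files: dict[str, bytes]) -> bytes:
    """Pack a mapping of file name -> file content into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def list_zip(archive_bytes: bytes) -> list[str]:
    """Return the file names stored in a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    exports = {
        "worksheet-01.txt": "\u0928\u092e\u0938\u094d\u0924\u0947".encode("utf-8"),
        "worksheet-02.txt": b"second document",
    }
    print(list_zip(build_zip(exports)))
```

Building the archive in memory avoids temporary files, which suits a handful of documents; very large batches would be better written to disk.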

Naming Convention

Files in the batch export are named according to the first line of each document (or the document title if you've set one). Special characters are removed from filenames to ensure compatibility across operating systems.
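The naming rule can be sketched as follows. The exact set of characters TranslitHub removes is not documented here, so this example assumes the characters Windows forbids in file names plus control characters. The function name `export_filename` and the 80-character limit are also assumptions.

```python
import re
from typing import Optional

# Assumption: strip characters Windows forbids in file names, plus control characters.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def export_filename(document_text: str, extension: str,
                    title: Optional[str] = None, max_length: int = 80) -> str:
    """Build a file name from the title, or else the first non-empty line of the text."""
    first_line = next((line for line in document_text.splitlines() if line.strip()), "untitled")
    source = title or first_line
    cleaned = _FORBIDDEN.sub("", source)
    cleaned = " ".join(cleaned.split()).strip(" .")  # collapse runs of whitespace
    # Truncation counts code points, so it can cut through a conjunct in long titles.
    cleaned = cleaned[:max_length].rstrip(" .") or "untitled"
    return f"{cleaned}.{extension}"


if __name__ == "__main__":
    print(export_filename("Worksheet 1: Nouns?\nFill in the blanks.", "pdf"))
```

Indian language characters pass through untouched, because modern file systems accept Unicode names; only the punctuation that breaks paths is dropped.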

Use Case: Hindi Worksheets for a Teacher

A Hindi teacher prepares 30 grammar worksheets for students. She writes each worksheet in a separate TranslitHub document. At the end, she batch-exports all 30 as PDFs. The ZIP download contains 30 properly named, font-embedded PDFs ready to distribute or print.

Exporting for Specific Platforms

For WhatsApp / Telegram

Export as TXT, then open the file, select all, and paste. Unicode Indian language text pastes correctly into WhatsApp Web, Telegram Web, and most chat apps. Don't use DOCX for messaging apps — they'll attach the file rather than displaying the text.

For Email (Gmail, Outlook)

Copy from the TranslitHub editor and paste directly into Gmail or Outlook. The Unicode text pastes as-is. If you need a formatted document as an attachment, attach the PDF export.

For WordPress

Use the "Copy as HTML" button in the editor, then paste into WordPress's HTML editor (Text view, not Visual). This preserves headings, bold/italic, and list structure alongside the Unicode Indian language text. Alternatively, use the TranslitHub widget to type directly in WordPress without copy-pasting.

For Google Docs

Open Google Docs, paste from the editor. Formatting is usually preserved well. If the Indian language font looks different from what you intended, change the font in Google Docs to "Noto Sans [Language]" — Google Docs has excellent Noto font support.

Font Rendering Quality Check

Before finalizing a document for distribution, it's worth doing a quick rendering check:

  1. Export as PDF
  2. Open the PDF on a different device (phone, tablet, or ask a colleague)
  3. Check that complex characters like Hindi conjuncts (क्ष, त्र, ज्ञ), Tamil compound letters, or Malayalam conjuncts display correctly
  4. Check that matras are positioned correctly: they should attach to the right consonant, not float independently

If you see rendering issues, try a different font size (very small text sometimes shows rendering artifacts), or report the specific character sequence to TranslitHub support. Glyph rendering issues are usually fixed quickly.
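A visual check catches font problems, but floating matras can also come from the text itself, for example a vowel sign typed after a space. The sketch below flags combining marks that have no base character before them. It is a simplified text-level check (start of text or after whitespace only), not a rendering test, and `orphan_marks` is a hypothetical helper.

```python
import unicodedata


def orphan_marks(text: str) -> list[int]:
    """Return indexes of combining marks that start the text or follow whitespace."""
    orphans = []
    previous = None
    for index, ch in enumerate(text):
        # Mn = nonspacing mark, Mc = spacing combining mark (most matras are one of these).
        if unicodedata.category(ch) in ("Mn", "Mc"):
            if previous is None or previous.isspace():
                orphans.append(index)
        previous = ch
    return orphans


if __name__ == "__main__":
    good = "\u0915\u093f"  # KA followed by VOWEL SIGN I
    bad = "\u0915 \u093e"  # KA, space, then a stray VOWEL SIGN AA
    print(orphan_marks(good), orphan_marks(bad))
```

If this reports an index, fix the text before blaming the font; if it reports nothing and the PDF still looks wrong, the character sequence is worth sending to support.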

Common Export Problems and Fixes

| Problem | Cause | Fix |
|---|---|---|
| PDF text appears as boxes | PDF viewer doesn't support embedded fonts | Use Adobe Acrobat or a modern browser's built-in PDF viewer |
| DOCX text garbled in old Word | Word version predates Unicode shaping | Update Word, or use PDF instead |
| TXT file shows ? characters | File opened with wrong encoding | Open with UTF-8 encoding in Notepad++ or VS Code |
| Hindi numerals instead of Arabic | preserveNumbers was false | Re-export with numbers toggle on |
| Extra line breaks in TXT | Line ending mismatch | Re-export with Windows line endings option |