PDF Text Extractor
Extract text from any PDF in your browser. Page-by-page view, copy, and download as .txt. The file is never uploaded.
How to use
- Drop a .pdf onto the upload area or click Choose file. Up to 200 MB is supported.
- The extractor decodes the page content streams in your browser and renders the combined text along with stats: pages, words, characters, and how many pages contain text.
- Switch to the By page view to step through individual pages, jump to a specific page, and copy that page on its own.
- Use Copy all text or Copy page text to copy to the clipboard, or Download .txt to save the extracted text as a UTF-8 file.
- Pages without extractable text are almost always scanned images without an OCR layer; the tool reports how many there are, so you know when a file needs an OCR pass.
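The stats mentioned in the steps above can be derived from nothing more than the per-page text. The following is a minimal sketch, not the tool's actual code: the `ExtractionStats` shape and the `computeStats` and `pagesNeedingOcr` names are hypothetical, words are assumed to be whitespace-separated runs, and characters are counted as UTF-16 code units on pages that contain text.

```typescript
// Hypothetical result shape; the field names are illustrative only.
interface ExtractionStats {
  pages: number;         // total pages in the document
  pagesWithText: number; // pages that yielded any non-whitespace text
  words: number;         // whitespace-separated runs
  characters: number;    // UTF-16 code units on pages that contain text
}

// Assumes pageTexts holds one already-extracted string per page.
function computeStats(pageTexts: string[]): ExtractionStats {
  let pagesWithText = 0;
  let words = 0;
  let characters = 0;
  for (const text of pageTexts) {
    const trimmed = text.trim();
    if (trimmed.length === 0) {
      continue; // likely a scanned page with no text layer
    }
    pagesWithText += 1;
    words += trimmed.split(/\s+/).length;
    characters += text.length;
  }
  return { pages: pageTexts.length, pagesWithText, words, characters };
}

// 1-based numbers of the pages that produced no text (candidates for OCR).
function pagesNeedingOcr(pageTexts: string[]): number[] {
  const empty: number[] = [];
  pageTexts.forEach((text, index) => {
    if (text.trim().length === 0) {
      empty.push(index + 1);
    }
  });
  return empty;
}

const stats = computeStats(["Hello world", "", "Third page text"]);
console.log(stats); // { pages: 3, pagesWithText: 2, words: 5, characters: 26 }
console.log(pagesNeedingOcr(["Hello world", "", "Third page text"])); // [ 2 ]
```

Listing the empty page numbers goes one step beyond a bare count; it is included here only to show how the same per-page array could point at the pages that need OCR.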
About this tool
PDF Text Extractor pulls the readable text out of a PDF without uploading the file. Drop a .pdf and the tool indexes every PDF object in the file, locates each page dictionary, decodes the page content streams, tokenizes the standard text-showing operators (Tj, TJ, ' and "), and returns the extracted text alongside word and character totals, the page count, and a per-page view you can step through. FlateDecode streams are decompressed natively in the browser using the Compression Streams API, and ASCIIHexDecode and ASCII85Decode are handled too.

Strings are decoded with full PDF rules: parenthesized literals with backslash escapes (including octal forms), angle-bracket hex strings, UTF-16 BE strings with the FE FF byte order mark used by Word and Acrobat, UTF-8 BOM literals used by some newer toolchains, and PDFDocEncoding for everything else (the small handful of bytes that differ from ASCII are mapped to their proper Unicode equivalents). Line breaks are inferred from Td, TD, T* and the line-show operators ' and ", and large negative kerning adjustments in TJ arrays are treated as word breaks the way Adobe's own extractor does.

Every step runs inside the tab. The file is read with file.arrayBuffer() into a typed-array view, kept in memory only for the time it takes to decode, and never transmitted to a server. That makes the tool safe for contracts, signed agreements, invoices, transcripts, medical records, school forms, and anything else you would rather not hand to a third-party SaaS.

Pages without a text layer (scanned image-only PDFs) are reported as empty so you know why nothing came out; this tool does not perform OCR. Encrypted PDFs are detected and flagged so you can remove the password first. Use Copy all text, Copy page text, or Download .txt to send the result wherever you need it.
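To make the string rules above concrete, here is a minimal sketch of three of them: backslash and octal escapes in parenthesized literals, angle-bracket hex strings, and the FE FF check for UTF-16 BE, plus the TJ word-break idea. It is not the tool's actual code. The function names are hypothetical, the -200 threshold is an assumed value, and bytes without a byte order mark are treated as Latin-1 rather than run through a full PDFDocEncoding table.

```typescript
// Decode the body of a (literal) string, without the outer parentheses, to bytes.
function decodeLiteral(body: string): number[] {
  const simple: Record<string, number> = { n: 0x0a, r: 0x0d, t: 0x09, b: 0x08, f: 0x0c };
  const out: number[] = [];
  let i = 0;
  while (i < body.length) {
    const ch = body.charAt(i);
    if (ch !== "\\") {
      out.push(body.charCodeAt(i) & 0xff);
      i += 1;
      continue;
    }
    const next = body.charAt(i + 1);
    if (next >= "0" && next <= "7") {
      // Octal escape: one to three digits, result truncated to one byte.
      let digits = "";
      let j = i + 1;
      while (j < body.length && digits.length < 3 &&
             body.charAt(j) >= "0" && body.charAt(j) <= "7") {
        digits += body.charAt(j);
        j += 1;
      }
      out.push(parseInt(digits, 8) & 0xff);
      i = j;
    } else if (next === "\r" || next === "\n") {
      // Backslash before an end-of-line is a line continuation: emit nothing.
      i += next === "\r" && body.charAt(i + 2) === "\n" ? 3 : 2;
    } else {
      const code = simple[next];
      if (code !== undefined) {
        out.push(code);
      } else if (next !== "") {
        // \( \) \\ and unknown escapes: drop the backslash, keep the character.
        out.push(next.charCodeAt(0) & 0xff);
      }
      i += 2;
    }
  }
  return out;
}

// Decode the body of a <hex> string; whitespace is ignored and an odd
// trailing digit is treated as if followed by 0.
function decodeHex(body: string): number[] {
  let digits = body.replace(/\s+/g, "");
  if (digits.length % 2 === 1) digits += "0";
  const out: number[] = [];
  for (let i = 0; i < digits.length; i += 2) {
    out.push(parseInt(digits.substring(i, i + 2), 16));
  }
  return out;
}

// FE FF marks UTF-16 BE; everything else is treated as Latin-1 here
// (a simplification standing in for a PDFDocEncoding lookup table).
function bytesToText(bytes: number[]): string {
  let text = "";
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    for (let i = 2; i + 1 < bytes.length; i += 2) {
      text += String.fromCharCode((bytes[i] << 8) | bytes[i + 1]);
    }
    return text;
  }
  for (const b of bytes) text += String.fromCharCode(b);
  return text;
}

// Join the items of a TJ array: strings are appended, and a kerning number at
// or below the (assumed) threshold is treated as a word break.
function joinTjArray(items: Array<string | number>, wordBreakThreshold: number = -200): string {
  let text = "";
  for (const item of items) {
    if (typeof item === "string") {
      text += item;
    } else if (item <= wordBreakThreshold && text !== "" &&
               text.charAt(text.length - 1) !== " ") {
      text += " ";
    }
  }
  return text;
}

const shown = joinTjArray([
  bytesToText(decodeLiteral("Hello\\054")), // \054 is the octal escape for ","
  -300,
  bytesToText(decodeHex("FEFF0077006F0072006C0064")),
]);
console.log(shown); // Hello, world
```

Numbers in a TJ array are subtracted from the current text position, so a large negative value opens a visible gap between glyphs. That is why the sketch turns it into a space; the right threshold in practice depends on the font and is a judgment call.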
Free to use. Works in your browser. No signup, no login.
Related tools
- PDF Page Counter: Drop a PDF, get the page count, file size, version, and encryption status locally.
- PDF Metadata Viewer: Drop a PDF, read the /Info dictionary and XMP packet locally, with PDF/A and encryption flags.
- PDF Form Field Inspector: List every AcroForm field with type, value, flags, and options.
- PDF Security Inspector: Drop a PDF and inspect its encryption handler, algorithm, key length, and all eight permission bits.
- HTML to Plain Text: Strip HTML tags and convert HTML to readable plain text with optional link URLs.
- Word Counter: Live word, character, sentence, paragraph, and reading time stats.