How to get a PDF indexed by Google

5 min read · updated 2026-05-13

The short answer

Google does index PDFs and can rank them like any page: it crawls the file, extracts the text, and treats it as a document. To get yours indexed, make the PDF text-based (not a scanned image), link to it from a crawlable page, make sure robots.txt does not block it, and give it a descriptive filename and title. A sitemap entry helps Google discover it sooner.

White papers, manuals, datasheets, research, and forms all pull search traffic as PDFs, because Google can index and rank a PDF for relevant queries much as it does an HTML page. But a few PDF-specific quirks decide whether yours gets indexed at all.

What makes a PDF indexable

It should contain real, selectable text. Google may try OCR on an image-only PDF, but the result is unreliable, so a scan with no text layer can end up indexed poorly or not at all. Run OCR first.
It must be crawlable: linked from somewhere Google can reach, and not blocked by robots.txt or behind a login.
It should be a reasonable size; enormous files can be crawled partially or skipped.
A descriptive filename and a title set in the PDF's document properties both help Google understand and label it.
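
The robots.txt part of the crawlability check can be tested offline. The sketch below uses Python's standard urllib.robotparser; the function name pdf_allowed, the example.com URLs, and the sample robots.txt are hypothetical. Note that this parser does plain prefix matching and does not implement Google's wildcard extensions, so treat the answer as an approximation of what Googlebot would decide.

```python
from urllib.robotparser import RobotFileParser


def pdf_allowed(robots_txt: str, pdf_url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch pdf_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, pdf_url)


# Hypothetical robots.txt that blocks one documents directory.
ROBOTS = """\
User-agent: *
Disallow: /docs/private/
"""

print(pdf_allowed(ROBOTS, "https://example.com/docs/private/manual.pdf"))  # False
print(pdf_allowed(ROBOTS, "https://example.com/docs/datasheet.pdf"))       # True
```

To check a live site, fetch its /robots.txt and pass the text to the same function.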

Make your PDFs discoverable

Link to the PDF from a relevant HTML page with descriptive anchor text — this is the main discovery path.
Include the PDF URL in your sitemap so Google knows it exists.
Set the PDF's Title and Author in its metadata; Google often uses the Title as the search result headline.
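If the PDF is on a domain you've verified in Search Console, you can push its URL through the Indexing API. Google documents that API for job-posting and livestream pages only, so a push for other content is a request, not a guarantee.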
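
A PDF goes into a sitemap the same way an HTML page does: as a url element with a loc child. Here is a minimal sketch using only the standard library; build_sitemap and the example.com URLs are illustrative names, not part of any tool mentioned here.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    # Register the sitemap namespace as the default so tags carry no prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    "https://example.com/guides/datasheet",       # HTML page
    "https://example.com/docs/datasheet.pdf",     # the PDF itself
]))
```

ElementTree escapes characters such as & in URLs, which hand-built strings often get wrong.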

Consider an HTML companion page

PDFs rank, but they're a worse user experience and harder to update than web pages. For important content, publish an HTML version as the primary, indexable page and offer the PDF as a download. You'll usually rank better and keep control of the layout, internal links, and analytics.

Confirm it indexed

Run the PDF's URL through a status check to confirm Google has it. If it's stuck, the usual cause is a missing text layer (scanned image) or a robots.txt block on the documents directory — fix that, then re-push.
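
For reference, a re-push boils down to one authenticated POST to the Indexing API's publish endpoint. The sketch below only builds the request and does not send it. It assumes you already hold an OAuth access token with the indexing scope for a verified property; build_publish_request, the example.com URL, and the token placeholder are hypothetical.

```python
import json
import urllib.request

ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"


def build_publish_request(pdf_url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a URL_UPDATED notification for pdf_url."""
    body = json.dumps({"url": pdf_url, "type": "URL_UPDATED"}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )


# Placeholder token: obtaining a real one is outside this sketch.
request = build_publish_request(
    "https://example.com/docs/datasheet.pdf", "ACCESS_TOKEN_PLACEHOLDER"
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```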

Sign in with Google, paste your URLs, ship them through Google's Indexing API. Free daily quota, $9.99 for a 50-URL pack.
