Fix "Blocked by robots.txt" in Google Search Console

Updated 2026-06-01

The symptom

Search Console: "Blocked by robots.txt" — the URL won't index.

TL;DR — the fix

Your robots.txt has a Disallow rule that matches the URL, so Google can't crawl it. Find the matching rule (robots.txt is prefix-matched), remove or narrow it if the block is accidental, redeploy, then push a re-crawl. Remember: robots.txt stops crawling, it does not deindex.

"Blocked by robots.txt" is one of the clearer Search Console statuses: Google found your URL, tried to crawl it, and your own robots.txt told it not to. The page can't be indexed with fresh content while that block stands. The good news is it's almost always a one-line fix.

Step 1: confirm which rule is blocking it

Open your robots.txt (yourdomain.com/robots.txt) and look for a Disallow line that matches the path. Remember that robots.txt is matched by prefix: Disallow: /blog blocks /blog, /blog/, and everything under it, and it also catches any other path that starts with the same characters, such as /blogroll. A stray Disallow: / blocks your entire site, a classic staging-config-leaked-to-production disaster.
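To make the prefix behaviour concrete, here is a minimal Python sketch. It assumes plain-prefix rules only (no wildcards or $ anchors), and the function name and sample paths are hypothetical, chosen for illustration.

```python
def matching_disallow_rules(disallow_values, path):
    """Return every Disallow value that prefix-matches the given URL path.

    An empty Disallow value blocks nothing, so it is skipped.
    """
    return [rule for rule in disallow_values if rule and path.startswith(rule)]


if __name__ == "__main__":
    rules = ["/blog", "/admin/"]
    for path in ["/blog", "/blog/", "/blog/post-1", "/blogroll", "/about"]:
        hits = matching_disallow_rules(rules, path)
        print(path, "blocked by", hits if hits else "nothing")
```

Note that /blogroll is reported as blocked by /blog: the rule is a character prefix, not a directory name.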

Use a tester instead of reading by eye

robots.txt matching has edge cases (wildcards, $ anchors, allow/disallow precedence) that are easy to misread. Paste your robots.txt and the URL into the robots.txt tester to see the exact rule that matches before you change anything.
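If you want to reason about the wildcard and anchor cases yourself, the sketch below translates a rule path into a regular expression. It is an illustrative approximation, not Google's parser: it assumes * matches any run of characters, a trailing $ anchors the end of the URL, and everything else is literal. The function names are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt rule path into a compiled regex (a sketch)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" where "*" appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def rule_matches(pattern, path):
    """True if the rule path matches the URL path (path plus query string)."""
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    print(rule_matches("/*.pdf$", "/files/a.pdf"))
    print(rule_matches("/*.pdf$", "/files/a.pdf?download=1"))
    print(rule_matches("/*.pdf", "/files/a.pdf?download=1"))
```

The second call is the easy one to misread by eye: with the $ anchor, a query string after .pdf means the rule no longer matches.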

Step 2: decide if the block is intentional

  • Intentional: admin areas, cart/checkout, internal search results, faceted-navigation traps, staging subdomains. Leave these blocked.
  • Accidental: your blog, product pages, or whole site blocked by a leftover rule, a CMS default, or a migrated staging robots.txt. Fix these.

Step 3: remove or narrow the rule

Delete the offending Disallow line, or narrow it so it only matches what you actually want blocked. If you need to block a section but still let Google read one URL inside it, add a more specific Allow line. In Google's implementation the rule with the longest matching path wins, so an Allow for the full URL overrides a shorter Disallow for the section.
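The carve-out can be sketched in Python as follows. This is a simplified model of the longest-match rule described for Google's implementation: it assumes plain-prefix rules (no wildcards), and the rule list, paths, and function name are made up for illustration.

```python
def decide(rules, path):
    """Pick the winning rule for a path.

    rules: list of (directive, prefix) tuples, directive is "allow" or "disallow".
    Returns (directive, prefix); ("allow", None) when nothing matches.
    """
    best = None
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        # Longer prefix wins; on equal length, allow (True) beats disallow (False).
        rank = (len(prefix), directive == "allow")
        if best is None or rank > best[0]:
            best = (rank, directive, prefix)
    if best is None:
        return ("allow", None)
    return (best[1], best[2])


if __name__ == "__main__":
    rules = [
        ("disallow", "/private/"),
        ("allow", "/private/press-kit.html"),
    ]
    for path in ["/private/press-kit.html", "/private/notes.html", "/about"]:
        print(path, decide(rules, path))
```

Here the Allow line is longer than the Disallow line, so the press-kit URL stays crawlable while the rest of /private/ remains blocked.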

robots.txt does not equal noindex

Blocking a URL in robots.txt doesn't remove it from Google — it just stops Google re-crawling it. A URL that was indexed before the block can linger in the index with stale content. And if you want a page out of the index, robots.txt is the wrong tool: Google can't read a noindex tag on a page it's not allowed to crawl. Allow the crawl, add noindex.
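Once the crawl is allowed, it helps to confirm the page really carries a noindex signal. The sketch below checks the two common places, a robots meta tag and an X-Robots-Tag response header, using only the Python standard library. It is a rough check under simplifying assumptions: it ignores user-agent-prefixed header values, and the function and class names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html, headers=None):
    """True if the HTML or the response headers carry a noindex directive."""
    header_directives = []
    for key, value in (headers or {}).items():
        if key.lower() == "x-robots-tag":
            header_directives += [d.strip().lower() for d in value.split(",")]
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in header_directives or "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(page))
```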

Step 4: re-crawl and confirm

  1. Deploy the updated robots.txt and load it in a browser to confirm it's live.
  2. Re-run the URL through the indexability checker — the block should clear once Google re-fetches robots.txt (usually within a day).
  3. Push the URL through the Indexing API to prompt a fresh crawl now that it's allowed.
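The last two steps can be sketched with Python's standard library. Two caveats: urllib.robotparser is a simple prefix matcher and does not reproduce Google's wildcard or longest-match behaviour, so treat it as a sanity check rather than a substitute for a proper tester. The request builder assumes you already hold an OAuth access token authorised for the Indexing API; obtaining that token is out of scope here, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.robotparser import RobotFileParser

# Publish endpoint of Google's Indexing API (v3).
INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Check a URL against robots.txt text using the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def build_publish_request(url, access_token):
    """Build (but do not send) a URL_UPDATED notification request."""
    body = json.dumps({"url": url, "type": "URL_UPDATED"}).encode("utf-8")
    return urllib.request.Request(
        INDEXING_ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",  # token is an assumption
        },
        method="POST",
    )


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /admin/\n"
    target = "https://example.com/blog/post"
    if is_crawlable(robots, target):
        request = build_publish_request(target, access_token="YOUR_TOKEN")
        print(request.get_method(), request.full_url)
        # Sending is left to the caller: urllib.request.urlopen(request)
```

Keeping the crawlability check in front of the publish call avoids asking for a re-crawl of a URL that is still blocked.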

IndexerNow runs on your own Google account, your own Cloud project, and your own quota — we never pool submissions through a shared account. Connect your GSC and push the URL through Google's Indexing API in two clicks.

Frequently asked

How long until the block clears after I fix robots.txt?

Google generally caches robots.txt for up to 24 hours, so it usually picks up your change within a day. Once it has the updated file, the status clears the next time it crawls the URL. Pushing the URL through the Indexing API brings that crawl forward.

Does robots.txt remove a page from Google?

No. robots.txt only blocks crawling. A page indexed before you added the block can stay in the index with stale content. To deindex, allow the crawl and add a noindex tag instead.

Allow vs Disallow — which wins?

In Google's implementation, the rule with the longest matching path wins; on an exact-length tie, Allow beats Disallow. That's why a specific Allow line can carve one URL out of a broader Disallow.
