Check every signal that controls indexing in one place: the HTTP status, robots.txt, the robots meta tag, the X-Robots-Tag header and the canonical tag. Get a clear yes/no verdict and the exact reason a page is blocked.
Interactive demo: sample data
This sample page is NOT indexable — a robots meta noindex is blocking it from search results.
HTTP status: 200, the page loads correctly (Looks good)
robots.txt: allows crawling of this URL for Googlebot (Looks good)
Robots meta tag: noindex, so Google can crawl this page but won't show it in results (Issue)
X-Robots-Tag header: not set (Looks good)
Canonical tag: points to https://example.com/other-page, so Google may index that URL instead (Warning)
How it works
Enter the page URL
Paste the exact URL you want to check and run it. We fetch the page and follow any redirects to its final destination, then read every signal a search engine uses to decide whether the page is allowed to be indexed.
Read the verdict and signals
You get a clear yes/no — can Google index this page? — followed by each signal we checked: the HTTP status, robots.txt for that path, the robots meta tag, the X-Robots-Tag response header, and the canonical tag. Anything blocking the page is named in plain language.
Fix the blocker and re-run
If the page is blocked, the verdict tells you exactly why — for example a robots meta noindex or a robots.txt Disallow. Fix it in your CMS, server config, or robots file, then re-run to confirm the page is clear before search engines re-crawl.
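The three steps above boil down to reading a handful of signals and reporting the first one that blocks indexing. The following is a minimal sketch of that decision, not the tool's actual implementation: the `Signals` container, the `verdict` function and the order of the checks are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signals:
    """Hypothetical container for the signals read from the final URL."""
    status: int                      # HTTP status after redirects
    robots_txt_allows: bool          # robots.txt permits crawling this path
    meta_robots: str = ""            # content of <meta name="robots">, if any
    x_robots_tag: str = ""           # X-Robots-Tag header value, if any
    canonical: Optional[str] = None  # rel="canonical" href, if any
    final_url: str = ""              # URL that actually loaded


def verdict(s: Signals) -> tuple[bool, str]:
    """Return (indexable, reason), checking blockers in a fixed order."""
    if not 200 <= s.status < 300:
        return False, f"HTTP status {s.status}: nothing to index"
    if not s.robots_txt_allows:
        return False, "robots.txt Disallow blocks crawling"
    # Simplification: a plain substring check for "noindex".
    if "noindex" in s.meta_robots.lower():
        return False, "robots meta noindex"
    if "noindex" in s.x_robots_tag.lower():
        return False, "X-Robots-Tag noindex"
    if s.canonical and s.canonical != s.final_url:
        # Not a hard block, so it is reported as a warning.
        return True, f"warning: canonical points to {s.canonical}"
    return True, "no blocking signal found"
```

Fed the sample data from the demo (status 200, crawling allowed, a robots meta noindex, no header, a canonical pointing elsewhere), this sketch reports the page as not indexable because of the robots meta noindex, since that check runs before the canonical warning.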
What we check
HTTP status — Confirms the final URL returns a 2xx status after redirects. A 404, 410, 500 or similar means there's nothing for Google to index, so the page must return a success status before any other signal matters.
robots.txt (Googlebot) — Reads the site's robots.txt and checks whether a Disallow rule blocks this URL's path for Googlebot (falling back to the wildcard user-agent group). A Disallow stops Google from crawling the page at all.
Robots meta tag — Looks for a <meta name="robots"> tag and flags noindex, which tells Google it's allowed to crawl the page but not show it in results. No tag defaults to index, follow.
X-Robots-Tag header — Checks the X-Robots-Tag HTTP response header for noindex. This is the server-side equivalent of the robots meta tag and is easy to miss because it's invisible in the page source — only visible in the response headers.
Canonical tag — Reads the rel="canonical" link. If it points to a different URL than the one you checked, we flag it: Google may treat that other URL as the version to index instead, which can quietly keep this page out of results.
Final URL after redirects — We resolve redirects and run every check against the page that actually loads, so a 301 to a noindexed or canonicalized destination is judged correctly rather than against the URL you typed.
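As one illustration of the checks listed above, the robots meta tag can be read with Python's standard `html.parser` module. This is a rough sketch rather than the tool's own parser: the class and function names are made up for the example, and it only looks at `name="robots"`, not crawler-specific names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated directive list.
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def meta_robots_directives(html: str) -> set[str]:
    """Return the robots directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # No tag at all means the defaults apply: index, follow.
    return parser.directives or {"index", "follow"}
```

A page is flagged when `"noindex"` is in the returned set; a page with no robots meta tag comes back as `{"index", "follow"}`.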
Common issues we catch
noindex meta tag left on after launch — Staging sites and new builds often ship with a global noindex to keep them out of search. If it isn't removed at launch, Google crawls the live page but refuses to index it — a silent reason pages never appear, with nothing visibly wrong on the page.
robots.txt Disallow hides the noindex you meant to use — A subtle trap: if you Disallow a URL in robots.txt and also put a noindex on the page, Google can't crawl the page, so it never sees the noindex. The URL can then still appear in results as a bare link. To deindex a page, allow crawling and use noindex — don't block it in robots.txt.
X-Robots-Tag noindex no one can see — Because the X-Robots-Tag lives in the HTTP response header, not the HTML, it's invisible when you view the page source. A server, CDN, or framework default can apply a noindex header to a whole path, blocking pages that look perfectly fine in the browser.
Canonical pointing to the wrong URL — A canonical that points somewhere else — a copied template default, a trailing-slash mismatch, or a parameterized version — tells Google to index that other URL instead. The page you care about can then be passed over even though every other signal is green.
robots.txt blocks a whole section — A broad rule like Disallow: /blog/, or a Disallow: / left over from a staging file, blocks every URL beneath it. One overly wide rule can stop an entire content section from being crawled without raising any error on the individual pages.
Confusing 'indexable' with 'indexed' — Passing every check here means Google is allowed to index the page — not that it already has. Whether a page is actually in the index depends on crawl scheduling, content quality and internal links. This tool confirms the door is open; getting through it is a separate step.
Redirect chain ending somewhere unexpected — The URL you submitted may 301 to a different page that is itself noindexed or canonicalized elsewhere. Checking the typed URL alone would miss it — which is why the verdict is judged against the final landing URL after redirects.
Where this matters
Google (Googlebot) — The robots.txt check targets the Googlebot user-agent group specifically, falling back to the wildcard (*) group, matching how Google decides whether it may crawl a URL.
Other search engines — The meta robots, X-Robots-Tag and canonical signals are honored by Bing and other major crawlers too, so a page that's indexable for Google is generally indexable for them — though robots.txt rules can be written per user-agent.
WordPress, Shopify, Wix & CMS platforms — These platforms expose noindex toggles (and ship 'discourage search engines' settings) that write the robots meta tag or header. This tool surfaces the result, so you can confirm a setting did what you expected.
CDNs, servers & frameworks — Server configs, CDNs and frameworks can inject an X-Robots-Tag header across whole paths. Because we read the live response header, we catch noindex rules applied at the infrastructure level, not just in the page.
Google Search Console — This tool checks the same on-page and header signals that Search Console surfaces in its Page indexing report (formerly Coverage) and the URL Inspection tool, making it a fast pre-check before you request indexing there.
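To see how per-user-agent rules play out, the sketch below uses Python's standard `urllib.robotparser` on an invented robots.txt in which the wildcard group blocks everything but a Googlebot group blocks only /private/. The file contents and the `can_crawl` helper are assumptions for illustration. Note that Python's parser applies the first matching rule, which can differ from Google's most-specific-match behavior on files that mix Allow and Disallow.

```python
from urllib.robotparser import RobotFileParser

# Invented example file: strict for everyone, looser for Googlebot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt text for one user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # A group naming the agent wins; otherwise the * group applies.
    return parser.can_fetch(user_agent, url)
```

With this file, `can_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/blog/post")` is allowed, the same call for a URL under /private/ is blocked, and a crawler with no group of its own, such as Bingbot, falls back to the wildcard group and is blocked everywhere.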
Frequently asked questions
What's the difference between 'indexable' and 'indexed'?
Indexable means Google is allowed to index the page — nothing in robots.txt, the meta tag, the header, the status code, or the canonical is blocking it. Indexed means the page is actually in Google's results. This tool checks the former: it confirms the door is open, not that Google has already walked through it.
My page passes every check but still isn't in Google — why?
Being indexable doesn't guarantee being indexed. Google still has to crawl the page, decide it's worth keeping, and schedule it in. Thin content, weak internal linking, or a brand-new URL can all delay or prevent indexing even when every technical signal is green. Submitting it in Search Console can speed up discovery.
What does a robots meta noindex actually do?
It lets Google crawl the page but tells it not to show the page in search results. The page is still fetched and read — the noindex only controls whether it can appear in the index. To deindex a page, this is the right tool, as long as the page isn't also blocked in robots.txt.
Why shouldn't I block a page in robots.txt to keep it out of Google?
Because a robots.txt Disallow stops Google from crawling the page at all — so it never sees a noindex tag inside it. The URL can then still surface in results as a bare link with no description. To keep a page out of the index, allow crawling and add a noindex instead of blocking it.
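The interaction between the two signals can be written out as a small decision function. This is only a sketch of the reasoning in the answer above; the function name and the wording of the outcomes are invented for the example.

```python
def noindex_outcome(crawl_allowed: bool, page_has_noindex: bool) -> str:
    """Describe what a crawler can conclude about a URL."""
    if not crawl_allowed:
        # The page is never fetched, so any noindex on it goes unseen.
        return "not crawled: URL may still appear as a bare link"
    if page_has_noindex:
        return "crawled: noindex seen, page kept out of results"
    return "crawled: page may be indexed"
```

The key point is the first branch: once crawling is disallowed, the value of `page_has_noindex` no longer affects the outcome.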
What is a canonical tag and why does it matter here?
A canonical tag tells Google which URL is the preferred version of a page. If your page's canonical points to a different URL, Google may index that other URL instead of the one you checked — so even a fully crawlable, index-allowed page can be quietly left out of results. We flag any canonical that points elsewhere.
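A canonical check along these lines can be sketched with the standard library. The names below are hypothetical, and the comparison is a deliberately strict string match after resolving relative hrefs, so a trailing-slash difference counts as pointing elsewhere; a real checker may normalize URLs differently.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Captures the href of the first <link rel="canonical">."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = {k: (v or "") for k, v in attrs}
        # rel can hold several space-separated tokens.
        if "canonical" in attr.get("rel", "").lower().split():
            self.href = attr.get("href") or None


def canonical_target(html: str, page_url: str) -> Optional[str]:
    """Return the absolute canonical URL, or None if there is no tag."""
    parser = CanonicalParser()
    parser.feed(html)
    if parser.href is None:
        return None
    return urljoin(page_url, parser.href)  # resolve relative hrefs


def canonical_points_elsewhere(html: str, page_url: str) -> bool:
    """True when a canonical exists and differs from the page URL."""
    target = canonical_target(html, page_url)
    return target is not None and target != page_url
```

A self-referencing canonical, or no canonical at all, returns `False`; a canonical to https://example.com/other-page on a different page returns `True`.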
Where does the X-Robots-Tag come from if it's not in my HTML?
It's sent in the HTTP response headers by your server, CDN, or framework rather than written into the page. That makes it invisible in view-source, which is exactly why it catches people out. A platform default can apply a noindex header to a whole directory without touching any page's HTML.
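Because the header never appears in the HTML, it has to be read from the response itself. The sketch below assumes the headers have already been collected as (name, value) pairs by some HTTP client; the function name is invented, and the parsing is a simplification that handles plain directive lists and the optional crawler-name prefix (for example `googlebot: noindex`).

```python
# Directives that legitimately contain a colon, so their name is not
# mistaken for a crawler-name prefix.
DIRECTIVES_WITH_VALUES = {
    "unavailable_after", "max-snippet",
    "max-image-preview", "max-video-preview",
}


def x_robots_noindex(headers: list[tuple[str, str]], bot: str = "googlebot") -> bool:
    """True if any X-Robots-Tag header tells `bot` not to index."""
    for name, value in headers:
        if name.lower() != "x-robots-tag":
            continue
        value = value.strip().lower()
        prefix, sep, rest = value.partition(":")
        prefix = prefix.strip()
        looks_like_bot = (
            sep and "," not in prefix and " " not in prefix
            and prefix not in DIRECTIVES_WITH_VALUES
        )
        if looks_like_bot:
            if prefix != bot:
                continue  # scoped to a different crawler
            value = rest
        directives = {d.strip() for d in value.split(",")}
        # "none" is shorthand for "noindex, nofollow".
        if "noindex" in directives or "none" in directives:
            return True
    return False
```

Header names are matched case-insensitively, and a response with several X-Robots-Tag headers is treated as blocked if any applicable one carries noindex.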
Does it check the URL I typed or where it redirects to?
It follows redirects and runs every check against the final URL that actually loads. So if your URL 301s to another page, the verdict reflects that destination's status, robots rules, meta tag, header and canonical — not the URL you started with.
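Resolving the final URL can be pictured as a loop that follows Location headers until a non-redirect response arrives. The sketch below is an assumption about how such a step might look, not the tool's code: `fetch` is a hypothetical stand-in for an HTTP client call with automatic redirects turned off, and the hop limit of 10 is arbitrary.

```python
from typing import Callable, Optional
from urllib.parse import urljoin

# fetch(url) -> (status, location); hypothetical stand-in for a real
# HTTP request made with automatic redirect-following disabled.
Fetch = Callable[[str], tuple[int, Optional[str]]]

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def resolve_final_url(
    url: str, fetch: Fetch, max_hops: int = 10
) -> tuple[str, int, list[str]]:
    """Follow redirects by hand; return (final_url, status, chain)."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_STATUSES or not location:
            return url, status, chain
        url = urljoin(url, location)  # Location may be relative
        if url in chain:
            raise ValueError(f"redirect loop at {url}")
        chain.append(url)
    raise ValueError(f"more than {max_hops} redirects")


# Usage with canned responses instead of real network calls:
responses = {
    "https://example.com/old": (301, "/new"),
    "https://example.com/new": (200, None),
}
final_url, final_status, hops = resolve_final_url(
    "https://example.com/old", responses.__getitem__
)
```

Every other check (robots.txt, meta tag, header, canonical) would then run against `final_url` rather than the URL that was typed in.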
How soon will Google index a page after I fix a blocker?
The fix is live immediately, but Google has to re-crawl the page to notice it, which can take days to a few weeks. You can speed it up by requesting indexing for the URL in Google Search Console rather than waiting for the next natural crawl.
This is one of several free SEO tools from Custom Web Audits. For a complete, prioritized analysis of your whole website, run a full audit.