Custom Web Audits
Audit Guide · 5 min read
XML Sitemap Health

XML Sitemap Health: The Map Google Uses to Find Your Pages (If It's Not Broken)

You've got 500 pages on your site but Google has only indexed 200. Or Search Console shows "Discovered - currently not indexed" on your best content. The culprit? Your XML sitemap is either missing, broken, or listing pages Google should ignore. It's like giving someone directions with half the addresses wrong.

What Is XML Sitemap Health?

An XML sitemap is a file (usually at yoursite.com/sitemap.xml) that lists every page you want search engines to index.

Think of it as a restaurant menu for Google. You're telling the bot "here's everything we serve, start with these dishes, and ignore the kitchen." A good sitemap lists only indexable, valuable pages. A bad one lists everything including 404s, redirects, and duplicate content—confusing Google and wasting crawl budget.
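To make the difference between a good and a bad sitemap concrete, here is a minimal Python sketch that reads the URLs out of a sitemap and flags entries that don't answer with HTTP 200. The sample sitemap, the example.com URLs, and the status table are all hypothetical; in a real audit the status lookup would be an HTTP request per URL.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample sitemap; the example.com URLs are placeholders.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/old-page</loc></url>
  <url><loc>https://example.com/moved</loc></url>
</urlset>"""


def sitemap_urls(xml_text):
    """Return the <loc> value of every <url> entry in a sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


def flag_bad_entries(urls, status_of):
    """Return {url: status} for entries that do not answer with HTTP 200.

    status_of is any callable mapping a URL to a status code. It is injected
    so the sketch runs without network access.
    """
    flagged = {}
    for url in urls:
        status = status_of(url)
        if status != 200:
            flagged[url] = status
    return flagged


# Assumed status codes for illustration only.
assumed_statuses = {
    "https://example.com/": 200,
    "https://example.com/old-page": 404,
    "https://example.com/moved": 301,
}
flagged = flag_bad_entries(sitemap_urls(SAMPLE_SITEMAP), assumed_statuses.get)
print(flagged)
```

Anything this flags (404s, redirects) is a candidate for removal from the sitemap.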

Why It Matters

For your visitors: Sitemaps don't directly affect user experience, but they influence whether your content shows up in search results. Without a sitemap, pages that aren't well linked may never be discovered by Google, so visitors searching for them won't find you.

For search rankings: Google discovers pages through links and sitemaps. If your internal linking is weak or pages are buried deep, the sitemap might be the only way Google finds them. Sites with broken sitemaps often have 30-50% of their content never indexed. You can't rank if you're not in the index.

For your bottom line: Every page Google doesn't index is potential traffic and revenue lost. We've seen e-commerce sites with 1,000 products where only 300 were in Google's index because their sitemap was broken. That's 700 products invisible to search—essentially non-existent to customers searching on Google.

Impact Summary:
User Experience: Indirect
SEO Impact: Critical
Traffic Effect: High
Difficulty to Fix: Easy

Who Should Handle This?

Business Owner: Verify sitemap exists; check Search Console for coverage issues

Marketing Manager: Monitor indexation rates; flag when new content isn't indexed

Developer/SEO: Generate and maintain sitemap; fix errors; submit to Search Console

For most small businesses, your CMS handles this for you: Shopify generates a sitemap automatically, and WordPress does so either natively or through plugins like Yoast SEO or Rank Math. If you're on a custom build, your developer needs to create and maintain it manually.

What to Look For in Your Audit

Green Flags (You're Good)

Yellow Flags (Needs Attention)

Red Flags (Fix Immediately)

Benchmark Reference:
Coverage: Good 90%+ | Needs Work 70-90% | Poor <70%
Errors: Good <5% | Concerning 5-20% | Bad >20%
Updates: Should refresh when content changes
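
The coverage and error thresholds above can be applied mechanically. The sketch below is one way to do that in Python; the function names are hypothetical, and treating exactly 90% coverage as "Good" and exactly 20% errors as "Concerning" is an assumption, since the ranges above overlap at their edges.

```python
def coverage_rating(indexed, submitted):
    """Rate indexed/submitted coverage against the benchmark thresholds."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    pct = 100 * indexed / submitted
    if pct >= 90:  # assumption: exactly 90% counts as Good
        return "Good"
    if pct >= 70:
        return "Needs Work"
    return "Poor"


def error_rating(errors, submitted):
    """Rate the share of sitemap URLs that report errors."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    pct = 100 * errors / submitted
    if pct < 5:
        return "Good"
    if pct <= 20:  # assumption: exactly 20% counts as Concerning
        return "Concerning"
    return "Bad"


# The e-commerce example from earlier: 300 of 1,000 products indexed.
print(coverage_rating(300, 1000))
```

With 300 of 1,000 URLs indexed, coverage is 30%, which lands in the "Poor" band.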

Best Practices

Keep it clean: Only include pages you want indexed. Don't list admin pages, thank-you pages, login pages, or duplicate content. Quality over quantity—a 100-page sitemap with all winners beats a 10,000-page sitemap full of junk.
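
As a rough illustration of that filtering, the Python sketch below drops admin, login, and thank-you URLs before they reach the sitemap. The path prefixes are assumptions to adapt to your own site, and skipping every URL with a query string is only a crude stand-in for real duplicate detection.

```python
from urllib.parse import urlsplit

# Assumed path prefixes for pages that should not be indexed; adjust per site.
EXCLUDED_PREFIXES = ("/admin", "/wp-admin", "/login", "/thank-you")


def clean_sitemap_urls(urls):
    """Keep only URLs that look indexable, preserving their original order."""
    seen = set()
    kept = []
    for url in urls:
        parts = urlsplit(url)
        path = parts.path or "/"
        # Match the prefix itself or anything nested under it.
        if any(path == p or path.startswith(p + "/") for p in EXCLUDED_PREFIXES):
            continue
        # Assumption: parameterized URLs are duplicates of a clean URL.
        if parts.query:
            continue
        if url in seen:
            continue
        seen.add(url)
        kept.append(url)
    return kept


candidates = [
    "https://example.com/",
    "https://example.com/admin/users",
    "https://example.com/login",
    "https://example.com/pricing",
    "https://example.com/pricing?utm_source=newsletter",
    "https://example.com/pricing",
    "https://example.com/thank-you",
    "https://example.com/administrators-guide",
]
print(clean_sitemap_urls(candidates))
```

Matching on whole path segments keeps a legitimate page such as /administrators-guide from being excluded just because it starts with "/admin".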

Split large sitemaps: A single sitemap file is limited to 50MB (uncompressed) and 50,000 URLs. If you're bigger, split into multiple sitemaps and use a sitemap index file to organize them.
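
A minimal Python sketch of that split, assuming the URL list is already in memory: it chunks by the 50,000-URL limit and builds a sitemap index pointing at each part. The file names (sitemap-1.xml and so on) and the example.com host are hypothetical, and the sketch does not check the 50MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_locations):
    """Return sitemap index XML (as a string) listing each child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for location in sitemap_locations:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = location
    return ET.tostring(root, encoding="unicode")


# Hypothetical site with 120,001 product URLs.
all_urls = [f"https://example.com/product/{n}" for n in range(120_001)]
chunks = chunk_urls(all_urls)
locations = [
    f"https://example.com/sitemap-{i}.xml" for i in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(locations)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk would then be written out as its own sitemap file, and only the index needs to be submitted in Search Console.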

Exclude noindex pages: If a page has a noindex tag, it shouldn't be in your sitemap. This creates a mixed signal that confuses Google and can slow indexing of your good pages.
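
One way to catch this mismatch is to test each sitemap URL's HTML and response headers for a noindex directive. The Python sketch below works on strings you have already fetched; the function name is hypothetical, and the X-Robots-Tag handling is simplified (it ignores bot-specific forms such as "googlebot: noindex").

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives.extend(
                part.strip().lower() for part in content.split(",")
            )


def is_noindex(html, x_robots_tag=""):
    """Return True if the page HTML or the X-Robots-Tag header says noindex."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    header_directives = [part.strip().lower() for part in x_robots_tag.split(",")]
    return "noindex" in parser.directives or "noindex" in header_directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindex(page))
```

Any sitemap URL for which this returns True should either lose the noindex tag or be dropped from the sitemap.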

Monitor in Search Console: Check the "Sitemaps" and "Coverage" reports monthly (the Coverage report now appears as "Pages" under Indexing). Look for errors, pages marked as not indexed (formerly "Excluded"), and discrepancies between submitted and indexed URLs.

Quick Win: Go to Google Search Console > Sitemaps right now. If you haven't submitted your sitemap, enter "sitemap.xml" and click Submit. Then check the Coverage ("Pages") report. If you see hundreds of "Discovered - currently not indexed" pages, your sitemap may have issues and is the first place to look.

Our Take

In our experience, sitemap issues are the silent killer of organic traffic. Unlike a misconfigured robots.txt (which breaks everything dramatically), a broken sitemap fails slowly. Google indexes some pages but not others, and businesses don't notice until they realize half their blog posts from the past year have zero traffic.

The most common mistake is using auto-generated sitemaps without ever checking what they include. WordPress plugins, for example, often add category pages, tag pages, author archives, and pagination—bloating your sitemap with thousands of low-value URLs. Google wastes crawl budget on these instead of your money pages. We've seen 10,000-URL sitemaps where only 2,000 URLs actually mattered.

Here's the hard truth: Submitting a sitemap full of errors is worse than having no sitemap at all. Google loses trust in your sitemap's accuracy and starts ignoring it. If Search Console shows your sitemap has 30% errors, Google might deprioritize your entire site in crawling. Fix the errors first, then resubmit—don't just hope Google figures it out. And if you're relying on a sitemap from 2019 that hasn't been updated, you're essentially invisible for any content published since then.
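
A quick way to spot a stale sitemap is to look at the newest lastmod date it contains. The Python sketch below assumes the sitemap uses lastmod values that begin with a YYYY-MM-DD date; the sample XML and function names are hypothetical. A sitemap with no lastmod values at all yields None, meaning freshness can't be judged this way.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap that was last touched in 2019.
OLD_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a</loc><lastmod>2019-03-10</lastmod></url>
  <url><loc>https://example.com/b</loc><lastmod>2019-06-01T12:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/c</loc></url>
</urlset>"""


def newest_lastmod(xml_text):
    """Return the most recent <lastmod> date in the sitemap, or None."""
    root = ET.fromstring(xml_text)
    dates = []
    for node in root.findall("sm:url/sm:lastmod", NS):
        text = (node.text or "").strip()
        if text:
            # Keep only the date part; lastmod may also carry a time.
            dates.append(date.fromisoformat(text[:10]))
    return max(dates) if dates else None


def days_since_update(xml_text, today):
    """Return days between the newest lastmod and `today`, or None if unknown."""
    newest = newest_lastmod(xml_text)
    return None if newest is None else (today - newest).days


print(newest_lastmod(OLD_SITEMAP), days_since_update(OLD_SITEMAP, date(2024, 6, 1)))
```

If the newest date is more than a year old while you have been publishing regularly, the sitemap is probably not being regenerated.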
