XML Sitemap Health: The Map Google Uses to Find Your Pages (If It's Not Broken)
You've got 500 pages on your site but Google has only indexed 200. Or Search Console shows "Discovered - currently not indexed" on your best content. A common culprit: your XML sitemap is missing, broken, or listing pages Google should ignore. It's like giving someone directions with half the addresses wrong.
What Is XML Sitemap Health?
An XML sitemap is a file (usually at yoursite.com/sitemap.xml) that lists every page you want search engines to index. Key components:
- URL List: Every page you want Google to crawl
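- Priority: A hint about which pages matter most (0.0 to 1.0 scale); optional, and Google says it ignores this value
- Change Frequency: A hint about how often pages update; optional, and also ignored by Google
- Last Modified Date: When content was last changed; Google uses it only when it is consistently accurate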
Think of it as a restaurant menu for Google. You're telling the bot "here's everything we serve," and the kitchen simply isn't on the menu. A good sitemap lists only indexable, valuable pages. A bad one lists everything, including 404s, redirects, and duplicate content, which confuses Google and wastes crawl budget.
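To make the format concrete, here is a minimal Python sketch of how such a file could be generated. The function name build_sitemap and the example.com URLs are illustrative assumptions, not part of any CMS or plugin. It writes only the URL and last-modified fields and leaves out the optional priority and change frequency fields.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string.

    pages: iterable of (url, lastmod) tuples, where lastmod is a
    'YYYY-MM-DD' string or None. (Hypothetical helper for illustration.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_sitemap([
        ("https://example.com/", "2024-01-15"),
        ("https://example.com/about", None),
    ]))
```

In practice your CMS or plugin produces this file for you. The sketch is only meant to show how little a valid sitemap actually needs.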
Why It Matters
For your visitors: Sitemaps don't directly affect user experience, but they influence whether your content shows up in search results. Without a sitemap, Google has to rely on links alone to discover your pages, so anything poorly linked may never reach searchers.
For search rankings: Google discovers pages through links and sitemaps. If your internal linking is weak or pages are buried deep, the sitemap might be the only way Google finds them. Sites with broken sitemaps can end up with 30-50% of their content never indexed. You can't rank if you're not in the index.
For your bottom line: Every page Google doesn't index is potential traffic and revenue lost. We've seen e-commerce sites with 1,000 products where only 300 were in Google's index because their sitemap was broken. That's 700 products invisible to search—essentially non-existent to customers searching on Google.
Impact Summary:
User Experience: Indirect
SEO Impact: Critical
Traffic Effect: High
Difficulty to Fix: Easy
Who Should Handle This?
Business Owner: Verify sitemap exists; check Search Console for coverage issues
Marketing Manager: Monitor indexation rates; flag when new content isn't indexed
Developer/SEO: Generate and maintain sitemap; fix errors; submit to Search Console
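For most small businesses, your CMS should auto-generate your sitemap: Shopify creates one automatically, and WordPress does it through its built-in sitemap or through plugins like Yoast SEO or Rank Math. If you're on a custom build, your developer needs to create and maintain it, ideally with a script that regenerates it whenever content changes.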
What to Look For in Your Audit
Green Flags (You're Good)
- Sitemap exists and loads at yoursite.com/sitemap.xml
- Submitted to Google Search Console and Bing Webmaster Tools
- Only contains indexable pages (no 404s, redirects, or noindex pages)
- Updates automatically when you add/remove content
- Coverage in Search Console shows 90%+ of sitemap URLs indexed
Yellow Flags (Needs Attention)
- Sitemap hasn't been submitted to Search Console
- Contains some redirected or 404 URLs (5-20%)
- Last modified dates are inaccurate or all the same
- Indexation rate is 70-90%
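The "inaccurate or all the same" last-modified flag above is easy to spot with a few lines of Python. This is a rough sketch, assuming you already have the sitemap XML as a string; lastmod_summary is a made-up helper name, and deciding what share of identical dates counts as suspicious is left to you.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_summary(sitemap_xml):
    """Summarize how varied the <lastmod> values in a sitemap are."""
    root = ET.fromstring(sitemap_xml)
    values = [
        (entry.findtext("sm:lastmod", default="", namespaces=NS) or "").strip()
        for entry in root.findall("sm:url", NS)
    ]
    present = [v for v in values if v]
    counts = Counter(present)
    top_value, top_count = counts.most_common(1)[0] if counts else (None, 0)
    return {
        "total": len(values),                  # number of <url> entries
        "missing": len(values) - len(present), # entries without a lastmod
        "distinct": len(counts),               # how many different dates appear
        "most_common": top_value,
        "most_common_share": top_count / len(present) if present else 0.0,
    }
```

If most_common_share comes back close to 1.0 on a site with years of content, the dates were probably stamped at generation time and are not telling Google anything useful.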
Red Flags (Fix Immediately)
- No sitemap exists (returns 404)
- Sitemap contains 20%+ errors (404s, redirects, noindex pages)
- Lists thousands of low-value pages (tags, archives, filters)
- Search Console shows "Couldn't fetch" or "Sitemap is HTML" errors
- Coverage shows most URLs "Discovered - currently not indexed"
- Sitemap hasn't been updated in 6+ months despite adding content
Benchmark Reference:
Coverage: Good 90%+ | Needs Work 70-90% | Poor <70%
Errors: Good <5% | Concerning 5-20% | Bad >20%
Updates: Should refresh when content changes
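The benchmark thresholds above can be turned into a small scoring helper. The sketch below is one possible way to do it in Python, not an official tool: it assumes you have already collected an HTTP status code for each sitemap URL (from a crawler of your choice) and the submitted and indexed counts from Search Console. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml):
    """Return every <loc> value found in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


def error_rate(status_by_url):
    """Share of URLs whose status is not 200 (404s, redirects, and so on)."""
    if not status_by_url:
        return 0.0
    bad = sum(1 for status in status_by_url.values() if status != 200)
    return bad / len(status_by_url)


def grade_errors(rate):
    """Errors: Good <5% | Concerning 5-20% | Bad >20%."""
    if rate < 0.05:
        return "Good"
    if rate <= 0.20:
        return "Concerning"
    return "Bad"


def grade_coverage(indexed, submitted):
    """Coverage: Good 90%+ | Needs Work 70-90% | Poor <70%."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive count")
    rate = indexed / submitted
    if rate >= 0.90:
        return "Good"
    if rate >= 0.70:
        return "Needs Work"
    return "Poor"
```

For the e-commerce example earlier (300 of 1,000 products indexed), grade_coverage(300, 1000) lands in "Poor". Note that this sketch does not check for noindex pages, which also count as sitemap errors.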
Best Practices
Keep it clean: Only include pages you want indexed. Don't list admin pages, thank-you pages, login pages, or duplicate content. Quality over quantity—a 100-page sitemap with all winners beats a 10,000-page sitemap full of junk.
Split large sitemaps: A single sitemap file is limited to 50MB (uncompressed) and 50,000 URLs. If you're bigger, split into multiple sitemaps and use a sitemap index file to organize them.
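Here is a rough Python sketch of that split, assuming you start from a flat list of URLs. The helper names and the sitemap-1.xml style file names are our own placeholders. It splits by URL count only, so a file with very long URLs could still exceed the size limit and would need a smaller chunk size.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index XML string pointing at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        child = ET.SubElement(index, "sitemap")
        ET.SubElement(child, "loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    all_urls = [f"https://example.com/product/{n}" for n in range(120_001)]
    chunks = chunk_urls(all_urls)
    # Placeholder file names; use whatever naming your site already has.
    names = [f"https://example.com/sitemap-{i}.xml" for i in range(1, len(chunks) + 1)]
    print(len(chunks), "child sitemaps")
    print(build_sitemap_index(names))
```

You then submit only the index file to Search Console, and Google follows it to the child sitemaps.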
Exclude noindex pages: If a page has a noindex tag, it shouldn't be in your sitemap. This creates a mixed signal that confuses Google and can slow indexing of your good pages.
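One way to catch this mixed signal is to check each sitemap URL's HTML and response headers for a noindex directive. The Python sketch below assumes you have already fetched the page; is_noindex is a hypothetical helper. It looks at the robots meta tag and the X-Robots-Tag header only, and it does not handle header values scoped to a specific crawler (such as "googlebot: noindex").

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            content = attr_map.get("content", "")
            self.directives.extend(d.strip().lower() for d in content.split(","))


def is_noindex(html, headers=None):
    """Return True if the page asks search engines not to index it."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    directives = list(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.extend(d.strip().lower() for d in value.split(","))
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives
```

Any URL where this returns True should come out of the sitemap, or lose its noindex tag if you do want it indexed.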
Monitor in Search Console: Check the "Sitemaps" and "Coverage" reports monthly (newer versions of Search Console label the Coverage report "Pages"). Look for errors, pages marked as "Excluded," and discrepancies between submitted and indexed URLs.
Quick Win: Go to Google Search Console > Sitemaps right now. If you haven't submitted your sitemap, enter "sitemap.xml" (or wherever your sitemap actually lives) and click Submit. Then check the Coverage report. If you see hundreds of "Discovered - currently not indexed" pages, your sitemap may be listing pages Google doesn't consider worth crawling, and it deserves a closer look.
Our Take
In our experience, sitemap issues are the silent killer of organic traffic. Unlike a misconfigured robots.txt (which can block your whole site at once), broken sitemaps fail slowly. Google indexes some pages but not others, and businesses don't notice until they realize half their blog posts from the past year have zero traffic.
The most common mistake is using auto-generated sitemaps without ever checking what they include. WordPress plugins, for example, often add category pages, tag pages, author archives, and pagination—bloating your sitemap with thousands of low-value URLs. Google wastes crawl budget on these instead of your money pages. We've seen 10,000-URL sitemaps where only 2,000 URLs actually mattered.
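A quick way to see how bloated an auto-generated sitemap is: sort its URLs into "keep" and "drop" piles by path. The Python sketch below uses patterns based on common WordPress URL structures (/tag/, /category/, /author/, /page/N/) and treats any URL with a query string as a filter page. Those rules are our assumptions for illustration, so adjust them to your own site before trusting the result.

```python
import re
from urllib.parse import urlsplit

# Assumed patterns for typical low-value archive URLs; adjust per site.
LOW_VALUE_PATTERNS = [
    re.compile(r"^/tag/"),
    re.compile(r"^/category/"),
    re.compile(r"^/author/"),
    re.compile(r"/page/\d+/?$"),  # paginated archives
]


def is_low_value(url):
    """Return True if the URL looks like an archive, pagination, or filter page."""
    parts = urlsplit(url)
    if parts.query:  # assumption: query strings mean filter or sort views
        return True
    return any(pattern.search(parts.path) for pattern in LOW_VALUE_PATTERNS)


def split_urls(urls):
    """Return (keep, drop) lists, preserving the original order."""
    keep, drop = [], []
    for url in urls:
        (drop if is_low_value(url) else keep).append(url)
    return keep, drop
```

Some category pages do earn search traffic, so treat the "drop" pile as a review list rather than an automatic delete list.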
Here's the hard truth: Submitting a sitemap full of errors can be worse than having no sitemap at all. Google may stop trusting your sitemap's accuracy and lean on it less. If Search Console shows your sitemap has 30% errors, crawling of the rest of your site can suffer too. Fix the errors first, then resubmit; don't just hope Google figures it out. And if you're relying on a sitemap from 2019 that hasn't been updated, anything published since then depends entirely on links to be discovered.