'Sitemap Could Not Be Read': How to Fix It in 2026

Seeing 'Sitemap could not be read' in Google Search Console? Here are the most common causes and exactly how to fix each one, step by step.

Indexly Team · 5 min read

If you just submitted your sitemap and Google Search Console threw back "sitemap could not be read," you are in the right place. This guide walks through every common cause of that error and the exact fix for each, so you can clear the status and get your pages crawled.

What the error actually means

"Sitemap could not be read" means Google reached the entry in your Sitemaps report but couldn't parse a valid XML sitemap at the URL you submitted. Search Console found something at that address, tried to read it as XML, and failed. The Status column shows the error instead of "Success."

It helps to separate two messages that look similar:

  • "Couldn't fetch" means Google could not even retrieve the file. The request failed, timed out, returned a server error, or was blocked before any content arrived.
  • "Sitemap could not be read" (also seen as "could not be read" or a generic parse error) means Google got a response, but the content was not a valid sitemap it could parse.

The fix depends on which side of that line you're on. A fetch failure points at networking, blocking, or a bad URL. A read failure points at the file itself: wrong format, broken XML, bad encoding, or wrong content type. Below, we cover both, because Search Console isn't always precise about which one tripped.

The 9 most common causes (and the fix for each)

1. Wrong sitemap URL submitted (typo or wrong path)

The simplest cause is the most common. You submitted /sitemap instead of /sitemap.xml, added a trailing slash, used http on an https site, or pointed at a path that never existed.

Fix: Open the exact URL from your Sitemaps report in a browser. If it 404s or redirects somewhere unexpected, you found the problem. Submit the canonical address (usually https://yourdomain.com/sitemap.xml), with the right protocol and no typos. If you're unsure of the conventions, our guide to submitting your sitemap the right way covers the exact steps.
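If you'd rather sanity-check the address before submitting it, here is a minimal Python sketch of the checks described above. The function name and the warning strings are made up for illustration, and the rules are heuristics: a valid sitemap can live at a path that doesn't end in .xml, so treat a warning as a prompt to look closer, not as proof of a problem.

```python
from typing import List
from urllib.parse import urlsplit


def sitemap_url_warnings(url: str) -> List[str]:
    """Return heuristic warnings for a sitemap URL before you submit it."""
    parts = urlsplit(url)
    warnings = []
    if parts.scheme != "https":
        warnings.append("scheme is not https")
    if not parts.netloc:
        warnings.append("no hostname (did you paste a relative path?)")
    if parts.path.endswith("/"):
        warnings.append("path ends with a trailing slash")
    if not parts.path.endswith((".xml", ".xml.gz")):
        warnings.append("path does not end in .xml or .xml.gz")
    return warnings


if __name__ == "__main__":
    for warning in sitemap_url_warnings("http://yourdomain.com/sitemap/"):
        print("warning:", warning)
```

An empty list means none of the common typos were spotted. You still need to load the URL to confirm the file is really there.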

2. Sitemap returns a 404 or 5xx error

If the file is missing (404) or your server is throwing errors (500, 502, 503), Google gets nothing usable. This often happens after a CMS migration, a plugin change, or a deploy that moved the file.

Fix: Load the sitemap URL in an incognito window. If you see an error page, the file isn't being served. Check headers from the command line:

curl -I https://yourdomain.com/sitemap.xml

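You want a 200 status, shown as HTTP/2 200 or HTTP/1.1 200 OK depending on the server. A 301 or 302 means the URL redirects, so submit the final address instead. Anything in the 400s or 500s has to be fixed at the server or CMS level before Google will read the file.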
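The same check can be scripted. The sketch below is one way to do it with Python's standard library. The function names and category labels are our own, and note that urllib follows redirects automatically, so head_status reports the status of the final URL, not the first hop.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough diagnosis for a sitemap URL."""
    if status == 200:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    if status in (401, 403):
        return "blocked"
    if status in (404, 410):
        return "missing"
    if 400 <= status < 500:
        return "client error"
    if 500 <= status < 600:
        return "server error"
    return "unexpected"


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the status code of the final response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx; the code is still what we want to report.
        return error.code


if __name__ == "__main__":
    code = head_status("https://yourdomain.com/sitemap.xml")
    print(code, classify_status(code))
```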

3. robots.txt blocks the sitemap or Googlebot

Your robots.txt can quietly block the sitemap path or Googlebot itself. A Disallow rule that catches the sitemap's directory, or a rule blocking the crawler, stops Google from reading the file.

Fix: Visit https://yourdomain.com/robots.txt and confirm nothing blocks the sitemap path or Googlebot. Then add an absolute reference so crawlers can always find it:

Sitemap: https://yourdomain.com/sitemap.xml

That line can sit anywhere in the file. For the full picture, read how robots.txt and sitemaps interact.
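To test a robots.txt file without eyeballing every rule, you can lean on Python's built-in urllib.robotparser. This is a rough sketch: check_robots is a hypothetical helper name, and the standard-library parser does not reproduce every detail of Google's matching (wildcards, for example), so a pass here is a good sign but not a guarantee.

```python
from typing import Tuple
from urllib.robotparser import RobotFileParser


def check_robots(robots_txt: str, sitemap_url: str,
                 agent: str = "Googlebot") -> Tuple[bool, bool]:
    """Return (agent may fetch the sitemap, sitemap is declared in robots.txt)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch(agent, sitemap_url)
    declared = parser.site_maps() or []  # site_maps() returns None when absent
    return allowed, sitemap_url in declared


if __name__ == "__main__":
    robots = (
        "User-agent: *\n"
        "Disallow: /private/\n"
        "Sitemap: https://yourdomain.com/sitemap.xml\n"
    )
    print(check_robots(robots, "https://yourdomain.com/sitemap.xml"))
```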

4. Wrong Content-Type header

Google expects the file served as application/xml or text/xml. Some servers and CMS plugins hand it back as text/html, and the parser refuses it.

Fix: Check the header:

curl -I https://yourdomain.com/sitemap.xml

Look for:

content-type: application/xml; charset=utf-8

If it says text/html, fix the server config or the plugin that generates the file. Indexly hosts every sitemap with the correct application/xml content type by default, so this class of error simply doesn't happen.
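If you check many sitemaps, the header comparison is easy to automate. A minimal sketch, assuming you already have the raw Content-Type value from curl or your HTTP client (the function name is ours):

```python
def is_xml_content_type(header_value: str) -> bool:
    """True when the media type is application/xml or text/xml.

    Parameters such as "; charset=utf-8" are ignored, and the comparison
    is case-insensitive.
    """
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in {"application/xml", "text/xml"}


if __name__ == "__main__":
    print(is_xml_content_type("application/xml; charset=utf-8"))
    print(is_xml_content_type("text/html; charset=UTF-8"))
```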

5. It's actually an HTML page, not XML

This trips up a lot of JavaScript frameworks and CMS setups. The URL returns a styled, human-readable page (a "sitemap" page with links) instead of raw XML. Google can't parse a web page as a sitemap.

Fix: View the source of the URL. The first line should be the XML declaration, not <!DOCTYPE html>. A valid XML sitemap starts like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/</loc>
  </url>
</urlset>

If you're seeing HTML, you submitted the wrong URL or your generator is producing a page instead of a feed. See what a valid XML sitemap looks like for the full spec.
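The "view source" check can also be done on the raw bytes. Here is a small sketch that looks only at the start of the response body. The function name and the three labels are illustrative, and the sniffing is deliberately crude: it tells you what the document claims to be, not whether it is valid.

```python
def sniff_document_type(body: bytes) -> str:
    """Classify a response body as "html", "xml", or "unknown" from its first bytes."""
    head = body[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    if head.startswith((b"<?xml", b"<urlset", b"<sitemapindex")):
        return "xml"
    return "unknown"


if __name__ == "__main__":
    print(sniff_document_type(b"<!DOCTYPE html>\n<html><body>Sitemap</body></html>"))
```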

6. Invalid or malformed XML

XML is strict. A single unescaped character breaks the whole file. The usual culprits: raw &, <, and > characters inside URLs, a missing namespace, or an unclosed tag.

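Fix: Inside <loc> values, special characters must be escaped: & becomes &amp;, < becomes &lt;, and > becomes &gt;. The sitemaps.org protocol also requires single and double quotes to be escaped as &apos; and &quot;. So a URL with a query string looks like this: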

<loc>https://yourdomain.com/search?q=shoes&amp;sort=price</loc>

Make sure the opening <urlset> carries the namespace shown above, and every <url> and <loc> tag is properly closed. A quick way to catch these: paste the file into any XML validator and let it flag the broken line.
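If you'd rather validate locally than paste into a website, Python's standard library can do a first pass. The sketch below checks well-formedness, the root element, and the namespace. It is not a full schema validation, and validate_sitemap and escape_loc are names we made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import List
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def escape_loc(url: str) -> str:
    """Escape the five characters the sitemap protocol requires inside <loc>."""
    return escape(url, {"'": "&apos;", '"': "&quot;"})


def validate_sitemap(xml_bytes: bytes) -> List[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as error:
        # The parser stops at the first broken character and reports line/column.
        return [f"XML parse error: {error}"]
    problems = []
    valid_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in valid_roots:
        problems.append(f"unexpected root element: {root.tag}")
    locs = root.findall(f".//{{{SITEMAP_NS}}}loc")
    if not locs:
        problems.append("no <loc> elements in the sitemap namespace")
    for loc in locs:
        if not (loc.text or "").strip():
            problems.append("empty <loc> element")
    return problems


if __name__ == "__main__":
    print(escape_loc("https://yourdomain.com/search?q=shoes&sort=price"))
```

A raw & in a URL shows up as a parse error with a line and column number, which is usually enough to find the offending entry.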

7. Encoding or BOM issues

The file has to be UTF-8. A byte-order mark (BOM) or any stray whitespace before the <?xml declaration breaks parsing. The XML declaration must be the very first bytes of the file, with nothing in front of it.

Fix: Save the file as UTF-8 without BOM. In most editors that's a save option or an "encoding" dropdown. Remove any blank lines or spaces before <?xml version="1.0". If your generator inserts a BOM, switch generators or strip it in your build step. Indexly always outputs clean UTF-8 with no BOM and no leading whitespace.
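For the "strip it in your build step" route, something like the following could work. It is a minimal sketch with hypothetical function names, and it assumes the sitemap begins with an XML declaration, as recommended above.

```python
import codecs
from typing import Optional


def leading_bytes_problem(data: bytes) -> Optional[str]:
    """Describe what is wrong with the start of the file, or return None."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 byte-order mark before the XML declaration"
    if not data.startswith(b"<?xml"):
        return "unexpected bytes before the XML declaration"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "file is not valid UTF-8"
    return None


def strip_leading_bytes(data: bytes) -> bytes:
    """Remove a UTF-8 BOM and any leading whitespace."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.lstrip()


if __name__ == "__main__":
    raw = codecs.BOM_UTF8 + b'\n<?xml version="1.0" encoding="UTF-8"?><urlset/>'
    print(leading_bytes_problem(raw))
    print(leading_bytes_problem(strip_leading_bytes(raw)))
```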

8. Gzip or compression misconfigured

Sitemaps can be gzipped (a .xml.gz file), which is fine when done right. It breaks when the server sends a .xml URL with a Content-Encoding: gzip header but uncompressed bytes, or vice versa. The mismatch makes the content unreadable.

Fix: If you serve sitemap.xml.gz, make sure the file is genuinely gzipped and the headers match. If you serve a plain .xml, don't apply gzip content-encoding to it manually. When in doubt, serve uncompressed XML — it's smaller than you think and removes a whole category of bugs.
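One way to spot a mismatch is to compare what the response claims with what the bytes actually are. The sketch below assumes you have the raw body exactly as sent on the wire, before your HTTP client decompresses anything (with curl, that means not passing --compressed). The function name and messages are ours.

```python
import gzip
import zlib
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def gzip_mismatch(path: str, content_encoding: str, raw_body: bytes) -> Optional[str]:
    """Return a description of the compression mismatch, or None if consistent."""
    declared = path.endswith(".gz") or content_encoding.strip().lower() == "gzip"
    looks_gzipped = raw_body.startswith(GZIP_MAGIC)
    if declared and not looks_gzipped:
        return "gzip is declared but the bytes are not gzip data"
    if looks_gzipped and not declared:
        return "the bytes are gzip data but nothing declares it"
    if looks_gzipped:
        try:
            gzip.decompress(raw_body)
        except (OSError, EOFError, zlib.error):
            return "the gzip data is truncated or corrupt"
    return None


if __name__ == "__main__":
    body = b'<?xml version="1.0" encoding="UTF-8"?><urlset/>'
    print(gzip_mismatch("/sitemap.xml", "gzip", body))
```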

9. A firewall or CDN is blocking Googlebot

Aggressive bot protection is a rising cause of this error. Cloudflare's Bot Fight Mode, a strict WAF rule, or rate limiting can serve Googlebot a challenge page or a 403 instead of your sitemap. You see the file fine in a browser, but Google gets blocked.

Fix: Check your CDN or firewall logs for blocked Googlebot requests. In Cloudflare, ease up on Bot Fight Mode for the sitemap path and confirm verified bots are allowed through. Then run the live test in Search Console's URL Inspection tool to confirm Google can now reach the file. A correctly served, always-available sitemap URL, which is exactly what Indexly gives you, sidesteps the flaky-origin and aggressive-firewall problems entirely.
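If your server writes access logs, you can tally what Googlebot actually received for the sitemap path. This is a rough sketch that assumes the common "combined" log format used by Apache and nginx. Your log layout may differ, the function name is hypothetical, and matching on the user-agent string alone can be fooled by bots pretending to be Googlebot.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed layout: ... "METHOD /path HTTP/x" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_sitemap_statuses(lines: Iterable[str],
                               sitemap_path: str = "/sitemap.xml") -> Counter:
    """Count response statuses for requests to the sitemap that claim to be Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match["path"] != sitemap_path:
            continue
        if "googlebot" not in match["agent"].lower():
            continue
        counts[int(match["status"])] += 1
    return counts


if __name__ == "__main__":
    # "access.log" is a placeholder path; point it at your own log file.
    with open("access.log", encoding="utf-8", errors="replace") as handle:
        print(googlebot_sitemap_statuses(handle))
```

A pile of 403s or 429s next to Googlebot's user agent is a strong hint that bot protection, not the sitemap itself, is the problem.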

A 60-second checklist

Run through these in order. Most "sitemap could not be read" errors fall out by step 4.

  1. Open the exact submitted URL in an incognito browser. Does it load? (Catches 404s, 5xx, redirects.)
  2. View source. Does it start with <?xml, not <!DOCTYPE html>? (Catches HTML-not-XML.)
  3. Run curl -I on the URL. Is the content type application/xml or text/xml? (Catches wrong Content-Type.)
  4. Check robots.txt. Is the sitemap path or Googlebot blocked? (Catches robots blocks.)
  5. Paste the file into an XML validator. Any errors? (Catches malformed XML and unescaped characters.)
  6. Confirm the file is UTF-8 with no BOM and nothing before <?xml. (Catches encoding issues.)
  7. Check your CDN/firewall logs for blocked Googlebot hits. (Catches bot-protection blocks.)
  8. Resubmit in the Sitemaps report and watch the Status column.

FAQ

How long does it take for the error to clear after I fix it?

After you fix the underlying issue and resubmit, the Status column usually updates within a few days as Google re-fetches the file. You can speed up confirmation with the live URL inspection tool, but the Sitemaps report itself refreshes on Google's own schedule, not instantly.

My sitemap loads fine in my browser. Why does Google still say it can't be read?

A browser request and a Googlebot request aren't the same. A firewall, CDN, or WAF can serve Googlebot a challenge page or 403 while letting you through. Wrong content-type headers also pass in a browser but fail Google's strict XML parser. Check headers and bot rules.

Does "sitemap could not be read" hurt my rankings?

The error itself doesn't directly penalize rankings, but it means Google isn't using your sitemap to discover pages. New or updated URLs may take longer to get crawled and indexed. Fixing it restores a reliable discovery path. See why Google isn't indexing your pages for related causes.

Can a sitemap that worked before suddenly break?

Yes, and it happens often. A CMS update, a new firewall rule, a plugin change, an SSL renewal, or a deploy that moves the file can all break a previously working sitemap. Set-and-forget generators rot over time. A hosted, monitored sitemap like Indexly's stays valid and reachable.

Should I submit a sitemap index or a single sitemap?

For most sites, a single sitemap is fine as long as it stays within the protocol limits of 50,000 URLs and 50MB uncompressed. Larger sites use a sitemap index that points to multiple child sitemaps. Either works, but the index file and every child it references must each be valid XML and reachable, or the same read error appears.
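If you do need an index, the split is mechanical. Here is a minimal sketch of chunking a URL list and writing the index file. The function names and the child file names in the usage example are invented, and the sketch only enforces the URL-count limit, not the file-size limit.

```python
from typing import List
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_urls(urls: List[str], size: int = MAX_URLS_PER_SITEMAP) -> List[List[str]]:
    """Split a URL list into groups small enough for one child sitemap each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(child_sitemap_urls: List[str]) -> str:
    """Build a sitemap index document that points at each child sitemap."""
    entries = "\n".join(
        f"  <sitemap>\n    <loc>{escape(url)}</loc>\n  </sitemap>"
        for url in child_sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
        f"{entries}\n"
        "</sitemapindex>\n"
    )


if __name__ == "__main__":
    children = [
        "https://yourdomain.com/sitemap-1.xml",
        "https://yourdomain.com/sitemap-2.xml",
    ]
    print(build_sitemap_index(children))
```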

The bottom line

"Sitemap could not be read" almost always comes down to one of nine things: a wrong URL, a server error, a robots block, a bad content type, HTML masquerading as XML, malformed XML, an encoding problem, broken compression, or a firewall blocking Googlebot. Work the 60-second checklist and you'll find your culprit fast.

The deeper fix is to stop hand-maintaining a file that quietly breaks. Indexly crawls your site, generates a valid UTF-8 XML sitemap, and hosts it on a permanent URL served with the right content type — then keeps it current as your pages change, with new and removed URL tracking and email alerts when something shifts. Most causes on this list simply can't happen.

Tired of fighting this error? Start free at indexly.dev and get a sitemap that just works.
