GUIDE // 02

Sitemap best practices.

Cut through the cargo-cult advice. Here's what actually moves the needle when you ship a sitemap to Google in 2025.

Only include canonical, indexable URLs

Your sitemap is a "please crawl these" suggestion. If a URL is noindex, canonicalised to another URL, 301-redirected, or 404, it doesn't belong in the sitemap. Mixed-quality sitemaps train Google to trust the sitemap less.
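
As a minimal sketch of that filter, assuming you already have crawl data for each URL: the `PageRecord` type and its field names are hypothetical, not part of any real crawler or CMS API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageRecord:
    """Hypothetical crawl record; the field names are illustrative only."""
    url: str
    status: int                      # HTTP status of the URL itself, redirects not followed
    noindex: bool = False            # robots meta tag or X-Robots-Tag says noindex
    canonical: Optional[str] = None  # rel=canonical target, if one is declared


def belongs_in_sitemap(page: PageRecord) -> bool:
    """Return True only for URLs that answer 200, are indexable, and are self-canonical."""
    if page.status != 200:           # drops 301 redirects, 404s, and server errors
        return False
    if page.noindex:
        return False
    if page.canonical is not None and page.canonical != page.url:
        return False                 # canonicalised to another URL
    return True


pages = [
    PageRecord("https://example.com/", 200),
    PageRecord("https://example.com/old", 301),
    PageRecord("https://example.com/tag/x", 200, noindex=True),
    PageRecord("https://example.com/p?ref=a", 200, canonical="https://example.com/p"),
    PageRecord("https://example.com/gone", 404),
]
sitemap_urls = [p.url for p in pages if belongs_in_sitemap(p)]
print(sitemap_urls)
```

Only the first record survives the filter; the other four each fail one of the checks named above.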

Keep lastmod honest

Real lastmod values help Google prioritise re-crawls. Faking it by setting lastmod to "today" on every URL is a well-known anti-pattern — Google detects it and ignores the field on your sitemap. Better to omit lastmod than to lie.
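
One way to keep the field honest is to write lastmod only when you actually know when the content changed. The sketch below assumes a hypothetical `url_entry` helper that receives the real modification date from your CMS, or `None` when it is unknown.

```python
from datetime import date
from typing import Optional
from xml.sax.saxutils import escape


def url_entry(loc: str, last_modified: Optional[date]) -> str:
    """Build one <url> element; omit <lastmod> when the real date is unknown."""
    parts = [f"<loc>{escape(loc)}</loc>"]
    if last_modified is not None:
        # date.isoformat() yields YYYY-MM-DD
        parts.append(f"<lastmod>{last_modified.isoformat()}</lastmod>")
    return "<url>" + "".join(parts) + "</url>"


print(url_entry("https://example.com/a", date(2025, 1, 15)))
print(url_entry("https://example.com/b", None))
```

The second call produces an entry with no lastmod at all, rather than falling back to today's date.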

Use sitemap-index for >50k URLs

The protocol limits a single sitemap to 50,000 URLs and 50 MB uncompressed. Above that you need a sitemap-index that references multiple child sitemaps. Most CMSes do this automatically.
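
If your platform does not handle this for you, the split can be sketched roughly as follows. The child filenames (`sitemap-1.xml` and so on) are assumptions, and the sketch only enforces the URL count, not the 50 MB size limit.

```python
from typing import Iterator, List
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the sitemap protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls: List[str], size: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_index(child_sitemap_urls: List[str]) -> str:
    """Return a sitemap index document that references each child sitemap."""
    entries = "".join(
        f"<sitemap><loc>{escape(u)}</loc></sitemap>" for u in child_sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<sitemapindex xmlns="{NS}">{entries}</sitemapindex>'
    )


urls = [f"https://example.com/p/{i}" for i in range(120_000)]
chunks = list(chunk(urls))
# Hypothetical child sitemap locations, one per chunk.
children = [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
index_xml = build_index(children)
print(len(chunks), [len(c) for c in chunks])  # 3 [50000, 50000, 20000]
```

With 120,000 URLs this gives three child sitemaps, and the index is the single file you submit.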

Split by content type

Separate sitemaps per content type (pages, posts, products, images, videos) make problems easier to triage in Search Console. GSC reports coverage per sitemap, so if one type is having indexing issues you'll see it cleanly.
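
A small sketch of the grouping step, assuming each URL already carries a content-type label. The `sitemap-<type>.xml` naming is a hypothetical convention, not something Google requires.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_type(items: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map a sitemap filename to the URLs of one content type."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for content_type, url in items:
        groups[f"sitemap-{content_type}.xml"].append(url)
    return dict(groups)


items = [
    ("pages", "https://example.com/about"),
    ("posts", "https://example.com/blog/hello"),
    ("products", "https://example.com/shop/widget"),
    ("posts", "https://example.com/blog/second"),
]
files = group_by_type(items)
for filename, file_urls in files.items():
    print(filename, len(file_urls))
```

Each resulting file can then be listed in a sitemap index, so an indexing problem in one content type shows up against its own sitemap.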

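Submit once, skip the ping

Submit your sitemap in both Search Console and Bing Webmaster Tools. After that, you don't need to resubmit on every change. Pinging /ping?sitemap=… no longer helps either: Google retired that endpoint in 2023. Keep lastmod accurate and reference the sitemap in robots.txt, and Google will pick up a major batch the next time it fetches the file.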

Validate before you ship

Run new sitemaps through the validator first. Most "sitemap could not be read" errors in GSC are catchable locally — bad date formats, invalid priority values, or URLs from a different hostname.
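
To illustrate the three error classes named above, here is a rough local check written with the Python standard library. It is a sketch, not the validator mentioned in the text: `check_sitemap` is a hypothetical helper, and the date pattern only checks the shape of the value, not whether the month or day is in range.

```python
import re
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# YYYY-MM-DD, optionally followed by a time and a timezone (Z or +hh:mm).
LASTMOD_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?$"
)


def check_sitemap(xml_text: str, expected_host: str) -> List[str]:
    """Return a list of human-readable problems found in a urlset document."""
    problems = []
    root = ET.fromstring(xml_text)
    for url in root.findall("sm:url", NS):
        loc = (url.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        if urlparse(loc).hostname != expected_host:
            problems.append(f"{loc}: hostname is not {expected_host}")
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        if lastmod is not None and not LASTMOD_RE.match(lastmod.strip()):
            problems.append(f"{loc}: bad lastmod {lastmod!r}")
        priority = url.findtext("sm:priority", default=None, namespaces=NS)
        if priority is not None:
            try:
                ok = 0.0 <= float(priority) <= 1.0
            except ValueError:
                ok = False
            if not ok:
                problems.append(f"{loc}: bad priority {priority!r}")
    return problems


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/a</loc><lastmod>2025-01-15</lastmod></url>"
    "<url><loc>https://example.com/b</loc><lastmod>15/01/2025</lastmod></url>"
    "<url><loc>https://example.com/c</loc><priority>1.5</priority></url>"
    "<url><loc>https://other.example.net/d</loc></url>"
    "</urlset>"
)
problems = check_sitemap(sample, "example.com")
for p in problems:
    print(p)
```

The sample deliberately contains one bad date, one out-of-range priority, and one URL on a different hostname, so the check reports three problems.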

Things that don't matter

  • Priority & changefreq. Google has publicly said it ignores these. Leave them in if your platform sets them automatically, but don't lose sleep over them.
  • The exact filename. Google reads the URL declared in robots.txt or submitted in GSC. sitemap.xml is just a convention.
  • Pretty XML formatting. Minified is fine. Indentation adds no SEO value and just makes the file bigger.

Frequently asked

Should I use lastmod down to the second?
A simple YYYY-MM-DD is enough for almost every site. Full ISO 8601 with timestamp is valid but rarely needed.
Should I list every URL or only the important ones?
List every canonical, indexable URL. Cherry-picking "important" URLs gives Google less information, not more.
Does sitemap submission help if Google already knows my site?
Yes. It's the cleanest signal for discovery on large or dynamic sites, and it lets you filter GSC's indexing report by sitemap to see which submitted URLs are indexed.