GUIDE // 02
Sitemap best practices.
Only include canonical, indexable URLs
Your sitemap is a "please crawl these" suggestion. If a URL is noindex, canonicalised to another URL, 301-redirected, or 404, it doesn't belong in the sitemap. Mixed-quality sitemaps train Google to trust the sitemap less.
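As a rough sketch of that filter, assuming you already have crawl data for each URL: the `Page` record and its field names below are hypothetical, not part of any particular crawler or CMS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical crawl record; the field names are assumptions for this sketch."""
    url: str
    status: int = 200                 # HTTP status returned for this URL
    noindex: bool = False             # robots meta tag or X-Robots-Tag says noindex
    canonical: Optional[str] = None   # rel=canonical target, if one is declared


def belongs_in_sitemap(page: Page) -> bool:
    """Keep only URLs that return 200, are indexable, and are their own canonical."""
    if page.status != 200:            # drops 301 redirects, 404s and other non-200s
        return False
    if page.noindex:
        return False
    if page.canonical is not None and page.canonical != page.url:
        return False                  # canonicalised to another URL
    return True


pages = [
    Page("https://example.com/"),
    Page("https://example.com/old", status=301),
    Page("https://example.com/gone", status=404),
    Page("https://example.com/private", noindex=True),
    Page("https://example.com/shoes?sort=price", canonical="https://example.com/shoes"),
    Page("https://example.com/shoes", canonical="https://example.com/shoes"),
]

sitemap_urls = [p.url for p in pages if belongs_in_sitemap(p)]
print(sitemap_urls)
```

Only the homepage and the self-canonical /shoes URL survive the filter here.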
Keep lastmod honest
Real lastmod values help Google prioritise re-crawls. Faking it by setting lastmod to "today" on every URL is a well-known anti-pattern — Google detects it and ignores the field on your sitemap. Better to omit lastmod than to lie.
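One way to keep lastmod honest is to emit it only when you have a real content-modified timestamp, and to leave the element out otherwise. A minimal sketch, where `url_entry` and its `content_modified` parameter are illustrative names rather than any platform's API:

```python
from datetime import datetime, timezone
from typing import Optional
from xml.sax.saxutils import escape


def url_entry(loc: str, content_modified: Optional[datetime]) -> str:
    """Build one <url> element; lastmod appears only when a real timestamp is known."""
    parts = [f"<loc>{escape(loc)}</loc>"]
    if content_modified is not None:
        # Use the stored modification date, never "now" at generation time.
        parts.append(f"<lastmod>{content_modified.date().isoformat()}</lastmod>")
    return "<url>" + "".join(parts) + "</url>"


known = url_entry("https://example.com/a", datetime(2024, 5, 1, tzinfo=timezone.utc))
unknown = url_entry("https://example.com/b", None)
print(known)
print(unknown)
```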
Use sitemap-index for >50k URLs
The protocol limits a single sitemap to 50,000 URLs and 50 MB uncompressed. Above that you need a sitemap-index that references multiple child sitemaps. Most CMSes do this automatically.
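If your platform doesn't handle this for you, the split can be sketched roughly as follows. The child filenames (`sitemap-1.xml` and so on) and the example.com URLs are made up for illustration, and the sketch only enforces the URL count, not the 50 MB size limit.

```python
from typing import Dict, Iterator, List
from xml.sax.saxutils import escape

MAX_URLS = 50_000  # per-file URL limit from the sitemap protocol
XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls: List[str], size: int = MAX_URLS) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def urlset_xml(urls: List[str]) -> str:
    """Render one child sitemap."""
    body = "".join(f"<url><loc>{escape(u)}</loc></url>" for u in urls)
    return f'<?xml version="1.0" encoding="UTF-8"?><urlset xmlns="{XMLNS}">{body}</urlset>'


def sitemap_index_xml(child_urls: List[str]) -> str:
    """Render the index file that points at every child sitemap."""
    body = "".join(f"<sitemap><loc>{escape(u)}</loc></sitemap>" for u in child_urls)
    return f'<?xml version="1.0" encoding="UTF-8"?><sitemapindex xmlns="{XMLNS}">{body}</sitemapindex>'


urls = [f"https://example.com/p/{i}" for i in range(120_000)]

children: Dict[str, str] = {}
for n, part in enumerate(chunk(urls), start=1):
    children[f"https://example.com/sitemap-{n}.xml"] = urlset_xml(part)

index = sitemap_index_xml(list(children))
print(len(children))  # 120,000 URLs split into 3 child sitemaps
```

In a real build you would also check each child's byte size before writing it out and split further if it gets close to the limit.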
Split by content type
Separate sitemaps per content type (pages, posts, products, images, videos) make problems easier to triage in Search Console. GSC reports coverage per sitemap, so if one type is having indexing issues you'll see it cleanly.
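A simple way to do the split is to bucket URLs before you write the files. The sketch below assumes content type can be read from the URL path; the prefixes and filenames are hypothetical, and your CMS may well expose the type directly.

```python
from collections import defaultdict
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical path-prefix rules; adjust them to your own URL structure.
TYPE_BY_PREFIX = {"/blog/": "posts", "/products/": "products"}


def content_type(url: str) -> str:
    """Classify a URL by path prefix, falling back to plain 'pages'."""
    path = urlparse(url).path
    for prefix, kind in TYPE_BY_PREFIX.items():
        if path.startswith(prefix):
            return kind
    return "pages"


def group_by_type(urls: List[str]) -> Dict[str, List[str]]:
    """Map each per-type sitemap filename to the URLs it should contain."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[f"sitemap-{content_type(url)}.xml"].append(url)
    return dict(groups)


groups = group_by_type([
    "https://example.com/about",
    "https://example.com/blog/hello-world",
    "https://example.com/products/red-shoes",
    "https://example.com/blog/second-post",
])
for filename, members in groups.items():
    print(filename, len(members))
```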
Submit once, ping on changes
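Submit your sitemap in both Search Console and Bing Webmaster Tools. After that, you don't need to resubmit on every change, but ping /ping?sitemap=… when you push a major batch, on search engines that still offer a ping endpoint. Google retired its own ping endpoint in 2023, so for Google rely on honest lastmod values instead.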
Validate before you ship
Run new sitemaps through the validator first. Most "sitemap could not be read" errors in GSC are catchable locally — bad date formats, invalid priority values, or URLs from a different hostname.
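The three checks named above are easy to approximate locally. This is a simplified sketch using only the standard library, not the validator itself: `validate_sitemap` and `valid_lastmod` are illustrative names, and the date check is looser than the full W3C Datetime rules (for example, it rejects the year-only form).

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime
from typing import List
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def valid_lastmod(value: str) -> bool:
    """Accept a plain date, or a date-time that carries a timezone."""
    try:
        if "T" not in value:
            date.fromisoformat(value)
            return True
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
        return parsed.tzinfo is not None
    except ValueError:
        return False


def validate_sitemap(xml_text: str, expected_host: str) -> List[str]:
    """Return human-readable problems found in a <urlset> document."""
    errors: List[str] = []
    root = ET.fromstring(xml_text)
    for url in root.findall("sm:url", NS):
        loc = (url.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        if urlparse(loc).hostname != expected_host:
            errors.append(f"{loc}: hostname is not {expected_host}")
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is not None and not valid_lastmod(lastmod.strip()):
            errors.append(f"{loc}: bad lastmod {lastmod!r}")
        priority = url.findtext("sm:priority", namespaces=NS)
        if priority is not None:
            try:
                in_range = 0.0 <= float(priority) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                errors.append(f"{loc}: priority {priority!r} is outside 0.0-1.0")
    return errors


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>https://example.com/a</loc><lastmod>2024-05-01</lastmod><priority>0.8</priority></url>
<url><loc>https://example.com/b</loc><lastmod>05/01/2024</lastmod></url>
<url><loc>https://example.com/c</loc><priority>1.5</priority></url>
<url><loc>https://other.example.net/d</loc></url>
</urlset>"""

errors = validate_sitemap(sample, "example.com")
for problem in errors:
    print(problem)
```

The sample has one problem of each kind, so three lines are printed; the first URL passes cleanly.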
Things that don't matter
- Priority & changefreq. Google has publicly said it ignores both. Set them if your platform does so, but don't lose sleep over them.
- The exact filename. Google reads the URL declared in robots.txt or submitted in GSC. sitemap.xml is just a convention.
- Pretty XML formatting. Minified is fine. Indentation adds no SEO value and just makes the file bigger.
Frequently asked
Should I use lastmod down to the second?
YYYY-MM-DD is enough for almost every site. A full ISO 8601 timestamp with time and timezone offset is also valid, but rarely needed.
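For reference, here is how the two forms look when produced from the same timestamp. The `modified` value is an arbitrary example, not data from a real site.

```python
from datetime import datetime, timezone

# Arbitrary example timestamp; in practice this comes from your content store.
modified = datetime(2024, 5, 1, 12, 30, 0, tzinfo=timezone.utc)

date_only = modified.date().isoformat()   # date form, enough for most sites
full = modified.isoformat()               # full form with time and UTC offset

print(date_only)  # 2024-05-01
print(full)       # 2024-05-01T12:30:00+00:00
```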