HTML sitemap to CSV.
Paste any human-readable sitemap URL and get a clean URL list. Relative links resolved, anchors deduped, ready for crawler tools or AI workflows.
Supports XML sitemaps, sitemap-index files (auto-followed), HTML site indexes and plain-text URL lists.
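As a rough illustration of how those four input kinds could be told apart from a response body, here is a minimal Python sketch. The function name `sniff_sitemap_kind` and its return labels are made up for this example. It is not the tool's actual detection logic, only one plausible way to do it with the standard library.

```python
import xml.etree.ElementTree as ET


def sniff_sitemap_kind(body: str) -> str:
    """Guess the input kind of a response body (hypothetical helper).

    Returns one of: "sitemap-index", "xml-sitemap", "html", "text".
    """
    # Drop a byte-order mark and leading whitespace before parsing.
    text = body.lstrip("\ufeff \t\r\n")

    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        root = None  # not well-formed XML; fall through to the HTML check

    if root is not None:
        # ElementTree prefixes tags with "{namespace}"; keep the local name.
        tag = root.tag.rsplit("}", 1)[-1].lower()
        if tag == "sitemapindex":
            return "sitemap-index"
        if tag == "urlset":
            return "xml-sitemap"
        if tag == "html":
            return "html"  # well-formed XHTML

    # Heuristic (an assumption): look for HTML markers near the top.
    head = text[:1024].lower()
    if head.startswith("<!doctype html") or "<html" in head or "<a " in head:
        return "html"
    return "text"
```

Anything that parses as XML but is neither a `urlset` nor a `sitemapindex` falls through to the HTML and plain-text checks, so a `sitemap.xml` that really serves an app shell would be reported as "html".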
When you need this
- Your sitemap.xml serves HTML (Next.js, Vercel edge, SPA fallbacks)
- The site only publishes a visual sitemap, not an XML one
- You want URLs out of a footer link directory or "site index" page
- Bulk-pulling URLs from a category / archive page for content audits
What it doesn't do
It only extracts links from the page you give it — it doesn't recursively crawl. If you want to build a sitemap by walking the site, use the URL generator instead.
Frequently asked
What's an HTML sitemap?
It's a regular webpage that lists every link on a site, usually at /sitemap or /sitemap.html. Unlike sitemap.xml, it's meant for humans, but the link list inside is still a goldmine if you need a flat URL inventory.
Why is my sitemap.xml actually HTML?
Common with Next.js, Vercel, single-page React/Vue apps, and some CDNs: the path /sitemap.xml resolves to the app shell instead of the real XML. If that's happening, paste the HTML response here and we'll pull URLs out of it.
Does it follow relative links?
Yes. Relative hrefs are resolved against the page URL, so you always get fully-qualified https://… URLs back.
Are nav / footer links filtered out?
Not automatically. The tool extracts every <a href> on the page, then dedupes. If your sitemap page also has site chrome, the chrome links will be in the output once, which is usually fine.
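To make the extract, resolve, and dedupe behaviour from the answers above concrete, here is a minimal Python sketch using only the standard library. The names `extract_urls` and `to_csv` are hypothetical, and two details are assumptions rather than documented behaviour of the tool: `#fragment` anchors are stripped before deduping, and non-HTTP links such as `mailto:` are skipped.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class _LinkCollector(HTMLParser):
    """Collects the raw href of every <a> tag, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href.strip())


def extract_urls(html: str, page_url: str) -> list:
    """Return absolute, deduplicated URLs from one page (no crawling)."""
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()

    seen = set()
    urls = []
    for href in collector.hrefs:
        # Resolve relative hrefs against the page URL, then drop "#fragment"
        # (assumption: two anchors into the same page count as one URL).
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        # Assumption: only http(s) links belong in a URL inventory.
        if urlsplit(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:
            seen.add(absolute)
            urls.append(absolute)
    return urls


def to_csv(urls: list) -> str:
    """Serialise the URL list as a one-column CSV with a "url" header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url"])
    writer.writerows([u] for u in urls)
    return buffer.getvalue()
```

Calling `to_csv(extract_urls(html, "https://example.com/sitemap"))` on a fetched page would give a flat list with nav and footer links included once each, matching the "not filtered, but deduped" behaviour described above. First-seen order is kept so the output follows the page layout.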