The free sitemap extractor that pulls every URL from any XML sitemap, even the hidden paths your sitemap.xml viewer misses.

What this tool does

The free sitemap extractor that goes deeper than the rest.

Most online sitemap extractors stop at /sitemap.xml. We don’t. We read robots.txt, follow nested sitemap indexes up to four levels deep, decompress .gz archives, and probe 15+ fallback paths that WordPress, Shopify, Webflow, and custom CMSes commonly use, so you get every URL the site lists in its sitemaps, not just the obvious ones.

Hidden sitemap.xml paths probed

15+ fallback URLs covered: wp-sitemap, sitemap_index, post-sitemap, page-sitemap, news-sitemap, image-sitemap and more.
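
For readers who want to see the idea in code, here is a minimal Python sketch of sitemap discovery: read any `Sitemap:` lines from robots.txt, then append fallback paths. The function names, the `.xml` suffix on every fallback name, and the ordering are assumptions made for illustration, not the tool's actual probe list, which is longer.

```python
# Assumed fallback paths, built from the names listed above.
# The ".xml" suffix on each one is an assumption.
FALLBACK_PATHS = [
    "/sitemap.xml",
    "/wp-sitemap.xml",
    "/sitemap_index.xml",
    "/post-sitemap.xml",
    "/page-sitemap.xml",
    "/news-sitemap.xml",
    "/image-sitemap.xml",
]


def sitemaps_from_robots(robots_txt: str) -> list:
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def candidate_sitemaps(host: str, robots_txt: str = "") -> list:
    """robots.txt declarations first, then fallback paths, duplicates removed."""
    base = f"https://{host}"
    candidates = sitemaps_from_robots(robots_txt) + [base + p for p in FALLBACK_PATHS]
    return list(dict.fromkeys(candidates))  # dict keeps first-seen order
```

Each candidate would then be requested in turn, keeping only those that return valid sitemap XML.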

Nested XML sitemap indexes resolved

Follows <sitemapindex> → <sitemap> chains up to 4 levels deep. Big sites split their sitemaps into files of up to 50,000 URLs each, and we recurse through all of them.
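
A rough Python sketch of depth-limited recursion over sitemap indexes follows. It assumes the root file counts as level 1 and takes a caller-supplied `fetch` function that returns raw XML bytes; both choices, and the name `collect_urls`, are illustrative and not the tool's real implementation.

```python
import xml.etree.ElementTree as ET

SM = "{http://www.sitemaps.org/schemas/sitemap/0.9}"  # sitemap protocol namespace
MAX_DEPTH = 4  # nesting limit quoted above; root file counted as level 1 (assumption)


def collect_urls(url, fetch, depth=1, seen=None):
    """Return page URLs reachable from `url`, following nested sitemap indexes."""
    seen = set() if seen is None else seen
    if depth > MAX_DEPTH or url in seen:
        return []  # too deep, or already visited (guards against loops)
    seen.add(url)

    root = ET.fromstring(fetch(url))
    if root.tag == SM + "sitemapindex":
        urls = []
        for loc in root.findall(f"{SM}sitemap/{SM}loc"):
            if loc.text:
                urls.extend(collect_urls(loc.text.strip(), fetch, depth + 1, seen))
        return urls

    # Otherwise treat the file as a plain <urlset>.
    return [loc.text.strip() for loc in root.findall(f"{SM}url/{SM}loc") if loc.text]
```

Passing `fetch` in as a parameter keeps the recursion testable without network access.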

Gzip sitemaps auto-handled

If the sitemap.xml is served as .gz, we decompress it transparently. You don't see the difference.
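
Transparent decompression can be sketched in a few lines of Python. This version checks the two-byte gzip signature instead of trusting the file extension; that detection strategy and the name `decode_sitemap` are assumptions for illustration.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_sitemap(body: bytes) -> bytes:
    """Return XML bytes, decompressing only when the payload is gzip."""
    if body[:2] == GZIP_MAGIC:
        return gzip.decompress(body)
    return body
```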

URL metadata preserved

lastmod, priority, and hreflang tags are kept on every URL extracted from the sitemap.
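
As an illustration of what keeping this metadata involves, the Python sketch below reads `lastmod`, `priority`, and `xhtml:link` hreflang alternates from a `<urlset>`. The row layout (a dict with `loc`, `lastmod`, `priority`, `hreflang` keys) is an assumption and not the tool's documented output format.

```python
import xml.etree.ElementTree as ET

SM = "{http://www.sitemaps.org/schemas/sitemap/0.9}"  # sitemap protocol namespace
XHTML = "{http://www.w3.org/1999/xhtml}"              # namespace of hreflang links


def parse_urlset(xml_bytes: bytes) -> list:
    """Return one dict per <url> entry, keeping its metadata when present."""
    rows = []
    for url in ET.fromstring(xml_bytes).findall(SM + "url"):
        loc = url.findtext(SM + "loc")
        if not loc:
            continue  # skip malformed entries without a location
        alternates = {
            link.get("hreflang"): link.get("href")
            for link in url.findall(XHTML + "link")
            if link.get("rel") == "alternate" and link.get("hreflang")
        }
        rows.append({
            "loc": loc.strip(),
            "lastmod": url.findtext(SM + "lastmod"),    # None when absent
            "priority": url.findtext(SM + "priority"),  # None when absent
            "hreflang": alternates,
        })
    return rows
```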

What you get

Every URL from the XML sitemap, sortable and ready to act on.

Every URL on the domain

Sorted by language (English first, then ZH, ES, …). Search-as-you-type filters by path.
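
If you export the list and want a similar ordering locally, a sketch like the one below would work. It assumes each row carries a `lang` code and a `loc` URL, and it ranks only the three languages named above; everything else falls to the end alphabetically. The real ordering used by the tool is not spelled out here.

```python
from urllib.parse import urlparse

# Assumed ranking: only the languages named above. Others sort after these.
LANG_ORDER = ["en", "zh", "es"]


def sort_by_language(rows):
    """Sort rows by language rank, then language code, then URL."""
    def key(row):
        lang = (row.get("lang") or "").lower()
        rank = LANG_ORDER.index(lang) if lang in LANG_ORDER else len(LANG_ORDER)
        return (rank, lang, row["loc"])
    return sorted(rows, key=key)


def filter_by_path(rows, query):
    """Keep rows whose URL path contains `query` (case-insensitive)."""
    q = query.lower()
    return [r for r in rows if q in urlparse(r["loc"]).path.lower()]
```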

lastmod + priority + language tag

The metadata Google sees, preserved on every row.

1-click follow-up on any URL

Run an SEO audit, a design Clone, or a single-page Crawl directly from the result row.

Export to JSON via API

GET /api/sitemap?domain=… returns the full list as JSON. Pipe it anywhere.
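
A hedged Python sketch of calling that endpoint follows. The host `https://tool.example` is a placeholder, not the real service address, and the shape of the JSON response is not documented here, so the sketch returns whatever the server sends without assuming field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://tool.example"  # placeholder host; substitute the real one


def sitemap_api_url(domain: str, base: str = API_BASE) -> str:
    """Build the GET /api/sitemap URL for a domain."""
    return f"{base}/api/sitemap?{urlencode({'domain': domain})}"


def fetch_sitemap_json(domain: str, base: str = API_BASE):
    """Fetch and decode the JSON response; its structure is not assumed."""
    with urlopen(sitemap_api_url(domain, base), timeout=30) as resp:
        return json.load(resp)
```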

Who uses this

A free sitemap URL extractor for marketers, founders & engineers.

SEO consultants

Audit a prospect's content footprint before the first call. Know their full URL map in 30 seconds.

Site owners mid-migration

Export every legacy URL so your 301 redirect plan doesn't lose a single page or its SEO equity.
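
One way to put the exported list to work is to check it against your redirect plan. The Python sketch below assumes the plan is a dict mapping old URLs to new ones and treats a trailing slash as insignificant; both are illustrative choices.

```python
def unmapped_legacy_urls(legacy_urls, redirect_map):
    """Return legacy URLs that have no entry in the redirect plan."""
    def norm(url):
        return url.rstrip("/")  # assumption: /page and /page/ are the same page

    planned = {norm(u) for u in redirect_map}
    return [u for u in legacy_urls if norm(u) not in planned]
```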

AI engineers building RAG

Feed a documentation site's entire URL list straight into your embedding pipeline.
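
For an ingestion job, lastmod can decide which pages are worth re-embedding. This Python sketch assumes rows shaped like `{"loc": ..., "lastmod": ...}` and keeps any row whose date is missing or unparseable, on the theory that re-processing is safer than silently dropping a page.

```python
from datetime import date


def changed_since(rows, cutoff: date):
    """Keep rows modified on or after `cutoff`, plus rows with no usable date."""
    fresh = []
    for row in rows:
        lastmod = row.get("lastmod")
        try:
            # lastmod is usually YYYY-MM-DD, optionally followed by a time.
            modified = date.fromisoformat(lastmod[:10]) if lastmod else None
        except ValueError:
            modified = None  # unrecognised format: treat as unknown
        if modified is None or modified >= cutoff:
            fresh.append(row)
    return fresh
```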

Content strategists

Pull two competitors' sitemaps side-by-side and find the topics they cover that you don't.
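
A simple way to compare two URL lists is to treat the first path segment as a rough stand-in for a topic. That proxy is an assumption, and real topic analysis would need more than this, but it can surface whole site sections you lack.

```python
from urllib.parse import urlparse


def top_level_sections(urls):
    """Return the set of first path segments, lowercased."""
    sections = set()
    for url in urls:
        parts = [p for p in urlparse(url).path.split("/") if p]
        if parts:
            sections.add(parts[0].lower())
    return sections


def sections_missing_from(mine, theirs):
    """Sections present in `theirs` but absent from `mine`, sorted."""
    return sorted(top_level_sections(theirs) - top_level_sections(mine))
```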

How to use

Extract URLs from any sitemap in three steps.

No signup. No CAPTCHA. Just paste a domain and run.
  1. Paste a domain

    Type or paste any domain (e.g. stripe.com) into the input above. Full URLs work too; only the hostname is used.

  2. Click Run

    CitedRank reads robots.txt, follows nested sitemap indexes (up to 4 levels deep), and probes 15+ fallback paths like /wp-sitemap.xml and /sitemap_index.xml.

  3. Browse or export

    Every URL is listed with its language, lastmod date, and a one-click button to run SEO, Clone, or Crawl on that page. Export the full list as JSON via the API.
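
The first step accepts either a bare domain or a full URL and keeps only the hostname. A minimal Python sketch of that normalization might look like this; the function name and the error handling are assumptions.

```python
from urllib.parse import urlparse


def hostname_from_input(text: str) -> str:
    """Extract the hostname from a bare domain or a full URL."""
    text = text.strip()
    if "://" not in text:
        text = "//" + text  # lets urlparse treat a bare domain as a network location
    host = urlparse(text).hostname
    if not host:
        raise ValueError("no hostname found in input")
    return host
```

With this helper, `stripe.com` and `https://stripe.com/pricing` both reduce to the same hostname.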

What people say

Loved by people who paste URLs all day.

I run this on every prospect before our discovery call. Getting their entire URL map in 30 seconds tells me more than an hour of manual browsing — and the 1-click 'audit this URL' on each row is a killer time-saver.

Maya Chen
Independent SEO consultant

Before redesigning my pricing page I pulled Linear's and Stripe's sitemaps to map their info architecture. Free, no signup, no friction. The tool is what every indie hacker needs and nobody else was shipping.

Daniel Park
Indie founder · Building Notewise

Our WordPress → Webflow migration would have lost 230 URLs without CitedRank. The hidden-sitemap probe caught everything our team missed in robots.txt and nested indexes.

Sarah Goldberg
Content lead · B2B SaaS

Feeds straight into our embedding pipeline. We ingest ~50 documentation sites a month — way faster than maintaining our own crawler, and the lastmod dates let us skip stale pages.

Raj Patel
AI engineer · Ragstack.ai

The language-by-population sort is a nice touch — I work mostly with English+German sites and the relevant URLs cluster at the top instead of buried among 40 hreflang variants.

Lena Müller
Freelance SEO auditor

More tools

The rest of the CitedRank toolkit

A URL list is rarely the end of the job. These tools pick up where the sitemap leaves off.