# CitedRank

> CitedRank is a free, no-signup web toolkit for marketers, SEO consultants, and AI engineers who need to understand any website quickly. It bundles six tools into one URL-paste interface: sitemap discovery, single-page content extraction, on-page SEO audit, generative-engine-optimization (GEO) readiness check, design system extraction, and structured-field extraction. Built for the AI-search era: every tool outputs LLM-friendly Markdown or JSON.

## Platform overview

CitedRank runs every tool on the same fast backend (Crawl4AI + Scrapling), shares a unified page cache across single-page tools (Crawl, SEO, Clone) so running any one of them on a URL makes the others sub-200ms, and exposes the same functionality via REST API and the UI.

## Tools

### Sitemap: https://citedrank.co/sitemap-extractor

The free sitemap extractor that pulls every URL from any XML sitemap, even the hidden paths your sitemap.xml viewer misses.

**Output:** URL list. **Scope:** whole domain.

### Crawl: https://citedrank.co/crawl

Scrape any webpage to clean Markdown for RAG, AI agents, and offline archiving. Plus image URLs and MHTML snapshots in one paste.

**Output:** CONTENT (text + images). **Scope:** single URL.

### SEO: https://citedrank.co/seo-checker

The free SEO checker that extracts every SEO signal from any page: title, meta, schema, headings, Open Graph, hreflang, link graph, media, content, tech.

**Output:** AUDIT (title / meta / schema / links). **Scope:** single URL.

### GEO Audit: https://citedrank.co/geo-audit

The free GEO audit that scores your AI search visibility for ChatGPT, Perplexity, Gemini, and Google AI Overviews, before competitors steal the citations.

**Output:** GEO audit report. **Scope:** single URL.

### Clone: https://citedrank.co/clone

Free design system extractor: pull colors, fonts, CSS variables, and Tailwind config from any website in 15 seconds.

**Output:** design system (colors / fonts / CSS). **Scope:** single URL.
### Extract: https://citedrank.co/extract

The free AI web scraper that turns any URL into structured JSON: pick a template or write CSS selectors, no code.

**Output:** structured JSON (rows + fields). **Scope:** single URL.

## Use cases

- **Competitive content audit**: feed a competitor URL to Crawl + SEO, get their content as Markdown plus their meta-tag strategy in one pass.
- **AI search visibility check (GEO)**: see whether ChatGPT, Perplexity, or Google AI Overviews can extract clean facts from a page; identify missing JSON-LD, FAQ schema, definition-first paragraphs.
- **RAG ingestion**: Crawl a marketing or docs site to clean Markdown for vector-database ingestion. MHTML offline snapshots include images for multimodal RAG.
- **Design system extraction**: Clone any production site to its core colors, fonts, and CSS variables; export as Tailwind config or CSS variables.
- **Structured data scraping**: Extract product cards, search results, or Reddit threads as JSON via CSS selectors. Built-in templates for Hacker News, GitHub Trending, Product Hunt, Reddit, generic blogs.
- **Sitemap discovery for migrations**: find every URL on a legacy site before a redesign so no page is lost to 404s.

## Pricing

CitedRank is currently free during development. No signup required for sitemap discovery; the other tools accept a free API token retrieved via Google sign-in.

## FAQ

### What is GEO (Generative Engine Optimization)?

GEO is the practice of structuring web content so AI search engines (ChatGPT, Claude, Perplexity, Google AI Overviews, Bing Copilot) can extract, summarize, and cite it accurately. It overlaps SEO but emphasizes machine-readable signals: JSON-LD schema (FAQPage, HowTo, Article), llms.txt presence, definition-first paragraphs, numbered HowTo steps, and tabular data over prose.

### What is the difference between Crawl and SEO in CitedRank?

Crawl returns the **page's content** (clean Markdown body plus image URLs) for AI ingestion.
SEO returns an **audit report** (title length, meta description, headings hierarchy, schema types, link graph) for diagnosis. They share a backend cache: running either populates the other instantly.

### What is the difference between Crawl and Extract?

Crawl gives you the **whole page as Markdown** (one large text blob). Extract gives you **structured JSON rows** by applying CSS selectors to repeating items, e.g., 30 product cards as `[{title, price, rating}, ...]`. Use Crawl for content; Extract for tabular data.

### What is llms.txt?

llms.txt is an emerging convention (https://llmstxt.org): a Markdown file at `/llms.txt` that gives LLM crawlers structured context about a site. CitedRank's GEO tool flags missing llms.txt as a fixable issue.

### Can I use CitedRank from Claude Code, Cursor, or Claude Desktop?

Yes. Three ways:

1. **MCP server** at `https://citedrank.co/mcp` (Streamable HTTP transport). Add to Cursor / Continue / Cline config and the agent gets 5 tools (sitemap, crawl, seo, clone, extract) registered automatically. For Claude Desktop wrap with `npx mcp-remote`.
2. **Claude Code skill** at `https://citedrank.co/skills/citedrank/SKILL.md`: single-file integration spec.
3. **Raw REST API**: see the API section below.
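
As a rough sketch of the third option, the snippet below builds authenticated POST requests for the `fetch_one` and `seo` endpoints named in the API section. The helper names `build_request` and `call`, the `YOUR_TOKEN` placeholder, the `Content-Type: application/json` header, and the assumption that responses come back as JSON are all illustrative guesses rather than documented behavior, so check them against the live API before relying on them.

```python
import json
import urllib.request

# Base URL taken from the endpoint list in this document.
API_BASE = "https://citedrank.co/api"


def build_request(endpoint: str, payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request.

    Hypothetical helper: sending the body as JSON is an assumption.
    """
    return urllib.request.Request(
        url=f"{API_BASE}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call(endpoint: str, payload: dict, token: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the body, assuming the API replies with JSON."""
    req = build_request(endpoint, payload, token)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Build requests for a Crawl and an SEO audit of the same page.
# `refresh` is optional according to the endpoint list.
crawl_req = build_request("fetch_one", {"url": "https://example.com", "refresh": False}, "YOUR_TOKEN")
seo_req = build_request("seo", {"url": "https://example.com"}, "YOUR_TOKEN")

print(crawl_req.full_url)  # https://citedrank.co/api/fetch_one
print(seo_req.get_method())  # POST
```

Nothing is sent until `call(...)` is invoked with a real token, for example `call("seo", {"url": "https://example.com"}, token)`. Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body first.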

## API

- POST `https://citedrank.co/api/sitemap?domain=X`: discover URLs
- POST `https://citedrank.co/api/fetch_one` body `{url, refresh?}`: Crawl single page
- POST `https://citedrank.co/api/seo` body `{url, refresh?}`: SEO audit
- POST `https://citedrank.co/api/clone` body `{url, refresh?}`: design system extract
- POST `https://citedrank.co/api/extract` body `{url, item_selector, fields[]}`: structured fields
- All endpoints require an `Authorization: Bearer <token>` header (the token is free; get it via sign-in)

## Links

- Home: https://citedrank.co
- MCP server: https://citedrank.co/mcp
- Claude Code skill: https://citedrank.co/skills/citedrank/SKILL.md
- Sitemap: https://citedrank.co/sitemap.xml
- robots.txt: https://citedrank.co/robots.txt

_Last updated: 2026-05-20._