XML Sitemap
A structured file listing important URLs on a website to help search engines discover and crawl content efficiently.
Also known as: Sitemap, sitemap.xml, Sitemap Index
Category: Concepts
Tags: seo, web, technical, crawling, indexing
Explanation
An XML sitemap is a file (usually sitemap.xml) that lists the URLs you want search engines to know about. Each entry has a required URL (loc) and optional metadata: last modification date (lastmod), change frequency (changefreq), and priority relative to other pages on the same site. Google has stated that it ignores changefreq and priority, so lastmod is the field most worth keeping accurate. For large sites, a sitemap index file can reference multiple individual sitemaps.
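To make the file structure concrete, here is a minimal sketch that builds a sitemap with Python's standard library. The function name build_sitemap and the example.com URLs are illustrative assumptions, not part of any particular CMS or tool; only loc and the optional lastmod are emitted.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes.

    `entries` is an iterable of (url, lastmod) pairs, where lastmod is a
    W3C date string such as "2024-01-15", or None to omit the element.
    (Hypothetical helper, for illustration only.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped on output
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


xml_bytes = build_sitemap([
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/about", None),
])
print(xml_bytes.decode("utf-8"))
```

In practice a CMS or build plugin usually generates this file; the sketch only shows what such a tool produces.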
Why sitemaps matter:
- Help search engines discover pages that might not be found through internal links alone
- Tell search engines when pages were last updated, and which URLs you consider worth crawling
- Most useful for new sites, large sites, sites with rich media, or sites with poor internal linking
- Improve crawl efficiency by directing crawlers to valuable content
Sitemap types:
- Standard XML sitemaps: List regular web pages
- Image sitemaps: Help index images
- Video sitemaps: Help index video content
- News sitemaps: For Google News inclusion
- Sitemap index: References multiple sitemaps (needed once a site exceeds the per-file limit of 50,000 URLs or 50 MB uncompressed)
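As a rough sketch of how a sitemap index relates to its child sitemaps, the following splits a URL list into files of at most 50,000 entries and builds an index that points at each one. The file naming scheme (sitemap-1.xml, sitemap-2.xml, ...) and the helper names are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (bytes) referencing each child sitemap URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


def plan_sitemaps(urls, base_url):
    """Map hypothetical child sitemap URLs to their URL chunks, plus the index XML."""
    chunks = chunk_urls(urls)
    names = [f"{base_url}/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
    return dict(zip(names, chunks)), build_sitemap_index(names)


urls = [f"https://example.com/p/{i}" for i in range(120_001)]
files, index_xml = plan_sitemaps(urls, "https://example.com")
print(len(files))  # number of child sitemaps needed
```

Each chunk would then be written out as its own sitemap file, and only the index needs to be submitted.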
Best practices:
- Only include canonical, indexable URLs (200 status code)
- Keep lastmod dates accurate—don't update them without real content changes
- Submit via Google Search Console and reference the sitemap in robots.txt with a Sitemap: line (for example, Sitemap: https://example.com/sitemap.xml)
- Remove URLs that return 404, redirect, or are noindexed
- Auto-generate sitemaps using your CMS or build tools
- Keep each sitemap file within 50,000 URLs and 50 MB uncompressed; split into multiple files under a sitemap index beyond that
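Several of the practices above amount to a filter applied before the sitemap is written. A minimal sketch follows, assuming a hypothetical Page record that your crawler or CMS export would need to supply; the field names are illustrative, not from any specific tool.

```python
from dataclasses import dataclass

MAX_URLS_PER_FILE = 50_000
MAX_BYTES_PER_FILE = 50 * 1024 * 1024  # 50 MB, measured uncompressed


@dataclass
class Page:
    """Hypothetical crawl record; field names are assumptions."""
    url: str
    status: int          # HTTP status code returned for `url`
    canonical_url: str   # canonical target declared by the page
    noindex: bool = False


def is_sitemap_eligible(page):
    """True only for 200-status, indexable pages that are their own canonical."""
    return (
        page.status == 200
        and not page.noindex
        and page.canonical_url == page.url
    )


def eligible_urls(pages):
    """Return the URLs that belong in the sitemap, preserving input order."""
    return [p.url for p in pages if is_sitemap_eligible(p)]


def within_limits(url_count, size_bytes):
    """Check a single sitemap file against the per-file limits."""
    return url_count <= MAX_URLS_PER_FILE and size_bytes <= MAX_BYTES_PER_FILE


pages = [
    Page("https://example.com/", 200, "https://example.com/"),
    Page("https://example.com/old", 301, "https://example.com/old"),
    Page("https://example.com/gone", 404, "https://example.com/gone"),
    Page("https://example.com/private", 200, "https://example.com/private", noindex=True),
    Page("https://example.com/?ref=x", 200, "https://example.com/"),
]
print(eligible_urls(pages))
```

Here only the homepage survives: the other entries redirect, return 404, are noindexed, or point to a different canonical URL.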
Sitemaps don't guarantee indexing—they're suggestions, not commands. They work best alongside good site architecture and internal linking.