Crawl Budget
The number of pages a search engine will crawl on a site within a given timeframe, influenced by crawl rate and crawl demand.
Related terms: Crawl Rate Limit, Crawl Demand
Category: Concepts
Tags: seo, web, technical, search-engines, crawling
Explanation
Crawl budget is the combination of two factors: crawl rate limit (how many requests per second a search engine makes without overloading the server) and crawl demand (how much the engine wants to crawl based on page importance, freshness, and popularity). Together, these determine how many pages get crawled in a given period.
For small sites (under a few thousand pages), crawl budget rarely matters—search engines will find everything. It becomes critical for large sites (100k+ pages), sites with lots of dynamically generated URLs, or sites with significant duplicate content. If the budget runs out before important pages are reached, those pages won't be indexed.
Factors that waste crawl budget include: duplicate content, faceted navigation generating infinite URL variations, soft 404 errors, session ID parameters in URLs, redirect chains, and low-quality pages. Factors that improve crawl efficiency include: clean site architecture, updated XML sitemaps, proper robots.txt directives, fast server response times, and internal linking to important pages.
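One common source of wasted budget is the same page being crawled under many URL variations (session IDs, tracking parameters, sort orders). A minimal sketch of URL normalization, using only the Python standard library; the parameter list is a hypothetical example and should be tuned per site:

```python
from urllib.parse import urlparse, urlencode, parse_qsl, urlunparse

# Hypothetical list of parameters treated as crawl-budget waste.
WASTE_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign", "sort"}

def normalize_url(url: str) -> str:
    """Strip session/tracking parameters so duplicate URL
    variations collapse to one canonical form."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in WASTE_PARAMS]
    return urlunparse(parts._replace(query=urlencode(kept)))

print(normalize_url("https://example.com/shoes?sort=price&sessionid=abc123&color=red"))
# → https://example.com/shoes?color=red
```

Running a site's crawled-URL export through a normalizer like this gives a quick estimate of how many fetched URLs were duplicates.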
Key optimization strategies:
- Block unimportant pages via robots.txt
- Use canonical URLs to consolidate duplicate content
- Fix or remove broken pages and redirect chains
- Keep server response times fast
- Maintain an accurate XML sitemap
- Use internal links to signal page importance
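The first two strategies above can be sketched in configuration. The paths and domain below are hypothetical examples, not a recommended universal ruleset:

```
# robots.txt — illustrative sketch only; adapt paths to your site
User-agent: *
Disallow: /search          # internal search result pages
Disallow: /*?sessionid=    # session-ID URL variations
Disallow: /cart/           # transactional pages with no search value

Sitemap: https://example.com/sitemap.xml
```

For duplicate content that must remain accessible, a canonical link tag in the page head (e.g. `<link rel="canonical" href="https://example.com/shoes/red-sneakers">`) consolidates signals onto one URL, whereas a robots.txt Disallow prevents the URL from being crawled at all.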
Google Search Console's Crawl Stats report shows how Googlebot interacts with your site, including pages crawled per day, response times, and crawl errors.
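Server access logs offer a complementary view of the same activity. A minimal sketch, assuming an Apache/Nginx combined log format; a production version should also verify Googlebot by reverse DNS rather than trusting the user-agent string:

```python
import re
from collections import Counter

# Matches the date portion of a combined-log timestamp, e.g. "[10/Mar/2024:".
LOG_DATE_RE = re.compile(r'\[(\d{2}/\w{3}/\d{4}):')

def googlebot_crawls_per_day(log_lines):
    """Approximate pages crawled per day by counting requests
    whose user-agent mentions Googlebot."""
    counts = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        m = LOG_DATE_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Hypothetical sample log lines for illustration.
sample = [
    '66.249.66.1 - - [10/Mar/2024:06:25:24 +0000] "GET /shoes HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '66.249.66.1 - - [10/Mar/2024:06:25:30 +0000] "GET /bags HTTP/1.1" 200 4210 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '203.0.113.9 - - [10/Mar/2024:06:26:01 +0000] "GET /shoes HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
]
print(googlebot_crawls_per_day(sample))  # → Counter({'10/Mar/2024': 2})
```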