Robots.txt
A text file placed at the root of a website that tells web crawlers which pages or sections to crawl or skip.
Also known as: Robots Exclusion Protocol, robots.txt file
Category: Concepts
Tags: seo, web, technical, crawling, protocols
Explanation
Robots.txt (the Robots Exclusion Protocol, standardized as RFC 9309) is a plain text file at a site's root (e.g., example.com/robots.txt) that communicates crawling rules to web robots. It uses directives such as User-agent (which crawler the rules apply to), Disallow (paths to skip), Allow (exceptions within disallowed paths), Sitemap (location of XML sitemaps), and the non-standard Crawl-delay (minimum time between requests, honored by some crawlers but ignored by Google).
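A minimal file illustrating these directives (the paths and sitemap URL are hypothetical):

```
# Rules for all crawlers
User-agent: *
Disallow: /admin/
Allow: /admin/public/
Crawl-delay: 10   # non-standard; not all crawlers honor it

Sitemap: https://example.com/sitemap.xml
```

Blank lines separate groups of rules; a crawler uses the group whose User-agent line best matches its own name, falling back to the `*` group.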
How it works:
1. A crawler visits a site and first checks for /robots.txt
2. It reads the directives applicable to its User-agent
3. It follows (or ignores, for non-compliant bots) the specified rules
4. It proceeds to crawl allowed pages
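The decision a compliant crawler makes in steps 2-4 can be sketched with Python's standard-library urllib.robotparser. The rules and the "MyBot" user agent below are hypothetical; note that Python's parser applies rules in file order (first match wins), unlike Google's longest-match rule, so the Allow line is listed before the broader Disallow:

```python
from urllib.robotparser import RobotFileParser

# Parse rules as a crawler would after fetching /robots.txt.
# parse() accepts the file's lines directly, so no network request is needed.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Allow: /admin/public/",   # exception, listed first so it matches first
    "Disallow: /admin/",
])

# Look up the rules applicable to our User-agent and decide per URL.
print(rp.can_fetch("MyBot", "https://example.com/admin/secret"))    # False
print(rp.can_fetch("MyBot", "https://example.com/admin/public/a"))  # True
print(rp.can_fetch("MyBot", "https://example.com/blog/post"))       # True
```

A non-compliant bot simply never performs this check, which is why robots.txt cannot serve as an access control mechanism.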
Important limitations:
- Robots.txt is advisory, not enforced; malicious or poorly written bots can simply ignore it
- It does not prevent indexing if other sites link to disallowed pages
- Disallowing a page doesn't remove it from search results (use noindex for that)
- Blocking CSS/JS files can hurt rendering and SEO
Common use cases:
- Prevent crawling of admin areas, staging environments, or internal search results
- Block crawling of duplicate content or low-value pages
- Manage crawl budget by directing bots to important content
- Point crawlers to the XML sitemap
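The use cases above translate directly into directives. A hypothetical file that hides internal search results and a staging path from all crawlers while advertising the sitemap:

```
User-agent: *
Disallow: /search      # internal search results (duplicate, low-value)
Disallow: /staging/    # work-in-progress content

Sitemap: https://example.com/sitemap.xml
```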
Best practices: keep the file simple, test changes with Google Search Console's robots.txt report, never use it to hide sensitive information (the file is publicly readable and can act as a map to private areas), and combine it with meta robots tags or X-Robots-Tag headers for full control over indexing.
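For pages that must stay out of search results, the meta robots tag (not a robots.txt Disallow) is the right tool; the page must remain crawlable so engines can actually see the tag:

```html
<!-- In the page's <head>: allow crawling, but ask engines not to index -->
<meta name="robots" content="noindex">
```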