What is robots.txt?

Answer

robots.txt is a plain-text file at the root of a domain (example.com/robots.txt) that tells search engine and AI crawlers which paths they may and may not crawl. The file uses User-agent and Allow/Disallow directives. For crawlers that honor it, robots.txt is the primary lever for granting or revoking AI crawler access to a site; it is a request to crawlers, not an enforcement mechanism.

How robots.txt works

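Compliant crawlers fetch the file before requesting other URLs on the site. Rules are organized into groups: each group starts with one or more User-agent lines, and its rules apply to those bots until the next User-agent declaration starts a new group. A crawler follows the group that names it and falls back to the User-agent: * group if none does. Allow grants access to a path prefix; Disallow blocks it. When both match the same URL, the most specific (longest) rule wins under RFC 9309. Comments start with #.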
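
The grouping behavior described above can be sketched with Python's standard-library urllib.robotparser. The robots.txt content and the bot name SomeOtherBot are hypothetical, chosen only for illustration. Python's parser applies the first matching rule within a group rather than the longest one, so real crawlers may differ in edge cases; this sample avoids overlapping rules so the two approaches agree.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS_TXT = """\
# Comments start with a hash sign.
User-agent: GPTBot
Disallow: /drafts/

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if the sample rules let `user_agent` fetch `path`."""
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    checks = [
        ("GPTBot", "/drafts/post"),        # blocked by the GPTBot group
        ("GPTBot", "/admin/"),             # GPTBot group has no /admin/ rule
        ("SomeOtherBot", "/admin/"),       # falls back to the * group
        ("SomeOtherBot", "/drafts/post"),  # the * group does not block /drafts/
    ]
    for agent, path in checks:
        print(f"{agent:>12} {path:<14} allowed={is_allowed(agent, path)}")
```

Note that GPTBot can fetch /admin/ here: once a bot has its own group, the rules in the * group no longer apply to it.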

AEO best practices

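Explicitly allow GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, CCBot and Applebot-Extended. Declare your sitemap location with a Sitemap directive; placing it at the bottom is a convention, since the directive is not tied to any User-agent group. Do not block JavaScript or CSS files, because Google needs them to render pages.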
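
As a minimal sketch of these practices, the following Python snippet generates a robots.txt that allows the crawlers listed above and declares a sitemap. The function name build_robots_txt and the sitemap URL are assumptions made for the example. It puts all the bots in one group with several User-agent lines; one group per bot would work equally well.

```python
# User-agent tokens taken from the list above.
AI_CRAWLERS = (
    "GPTBot",
    "ClaudeBot",
    "PerplexityBot",
    "Google-Extended",
    "Bytespider",
    "CCBot",
    "Applebot-Extended",
)


def build_robots_txt(sitemap_url: str, crawlers=AI_CRAWLERS) -> str:
    """Build robots.txt text that allows `crawlers` and declares a sitemap.

    Hypothetical helper for illustration only.
    """
    lines = ["# Explicitly allow AI crawlers."]
    lines += [f"User-agent: {name}" for name in crawlers]
    # "Allow: /" permits every path for the bots named in this group.
    lines += ["Allow: /", "", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The sitemap URL is a placeholder; substitute your own.
    print(build_robots_txt("https://example.com/sitemap.xml"), end="")
```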

How to audit

Use the Readiness Check, which tests robots.txt for blocks against all major AI crawlers. Validate syntax with the robots.txt report in Google Search Console, which replaced the standalone robots.txt Tester.
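
For a quick local check, an audit along these lines can be sketched with Python's standard-library urllib.robotparser. This is not how the Readiness Check itself works; the function names and the sample file are hypothetical. Python's parser matches user agents by substring and applies the first matching rule, so treat the result as an approximation of how a given crawler will behave.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens of the AI crawlers named in this article.
AI_CRAWLERS = (
    "GPTBot",
    "ClaudeBot",
    "PerplexityBot",
    "Google-Extended",
    "Bytespider",
    "CCBot",
    "Applebot-Extended",
)


def audit_robots_txt(robots_txt: str, path: str = "/") -> dict:
    """Map each AI crawler to True if it may fetch `path`, else False."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {name: parser.can_fetch(name, path) for name in AI_CRAWLERS}


def blocked_crawlers(robots_txt: str, path: str = "/") -> list:
    """List the AI crawlers that are blocked from `path`."""
    results = audit_robots_txt(robots_txt, path)
    return [name for name, allowed in results.items() if not allowed]


# Hypothetical robots.txt that blocks two crawlers and allows the rest.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print("Blocked:", blocked_crawlers(SAMPLE_ROBOTS_TXT))
```

To audit a live site instead of a string, RobotFileParser also offers set_url() followed by read(), which fetches the file over the network before can_fetch() is called.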

Want help shipping AEO into your site?

Run the free 50-signal AI Agent Readiness Check or book a free scoping call.
