robots.txt is a plain-text file at the root of a domain (example.com/robots.txt) that tells search engine and AI crawlers which paths they may and may not crawl. The file uses User-agent and Allow/Disallow directives. Compliance is voluntary, but for crawlers that honor it, robots.txt is the primary lever for granting or revoking AI crawler access to a site.
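Compliant crawlers fetch the file before requesting any other URL on the site. Each User-agent block defines rules for the bot or bots named at its top. Allow grants access; Disallow blocks it. A block's rules run until the next User-agent declaration, and a crawler follows the block that names it, falling back to the User-agent: * block if none does. Comments start with #.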
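The block rules described above can be checked with a minimal sketch using Python's standard-library urllib.robotparser. The sample file, the helper name allowed, and the example.com URLs are illustrative assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt used only for illustration
SAMPLE = """\
# Rules for one named crawler
User-agent: GPTBot
Disallow: /private/

# Fallback rules for every other crawler
User-agent: *
Disallow: /admin/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# GPTBot follows its own block, so only /private/ is off limits for it
print(allowed(SAMPLE, "GPTBot", "https://example.com/private/page"))
print(allowed(SAMPLE, "GPTBot", "https://example.com/admin/"))
# ClaudeBot is not named, so it falls back to the * block
print(allowed(SAMPLE, "ClaudeBot", "https://example.com/admin/"))
```

Note that a named crawler does not inherit the rules of the * block: once GPTBot has its own block, the /admin/ rule no longer applies to it unless it is repeated there.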
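Allow GPTBot, ClaudeBot, PerplexityBot, Bytespider and CCBot explicitly, along with the Google-Extended and Applebot-Extended tokens, which control AI use of content crawled by Googlebot and Applebot rather than identifying separate crawlers. Declare your sitemap with a Sitemap line that gives the full URL; it is conventionally placed at the bottom of the file. Do not block JavaScript or CSS (Google needs them to render pages).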
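As a rough sketch of what such a file could look like, the snippet below generates one explicit Allow group per agent and ends with a Sitemap line. The function name build_robots_txt and the sitemap URL are hypothetical; adapt the agent list and paths to your own site.

```python
# User-agent tokens listed in the text above
AI_AGENTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "Bytespider", "CCBot", "Applebot-Extended",
]


def build_robots_txt(agents, sitemap_url):
    """Return robots.txt text with one explicit Allow group per agent."""
    lines = []
    for agent in agents:
        # One group per agent, separated by a blank line
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    # The Sitemap line takes an absolute URL
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


# Assumed sitemap location, for illustration only
print(build_robots_txt(AI_AGENTS, "https://example.com/sitemap.xml"))
```

One group per agent keeps each rule easy to audit and to revoke later by changing a single Allow line to Disallow.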
Use the Readiness Check, which tests robots.txt for blocks against all major AI crawlers. Validate the file with the robots.txt report in Google Search Console, which replaced the older robots.txt Tester.
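For a quick local approximation of this kind of test, the sketch below lists which AI crawler tokens a given robots.txt blocks from the site root. It is not the Readiness Check's implementation; the helper name blocked_agents and the sample file are assumptions, and urllib.robotparser's matching can differ in edge cases from that of individual crawlers.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "Bytespider", "CCBot", "Applebot-Extended",
]


def blocked_agents(robots_txt, agents=AI_AGENTS, url="https://example.com/"):
    """Return the agents that the given robots.txt text blocks from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical file that blocks two AI crawlers and allows everyone else
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_agents(SAMPLE))  # ['GPTBot', 'CCBot']
```

To run it against a live site, RobotFileParser also offers set_url and read to fetch the file over HTTP instead of parsing a string.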
Run the free 50-signal AI Agent Readiness Check or book a free scoping call.