Free tool · Checks every AI bot
robots.txt checker for AI bots
We check whether GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and 10 other AI crawlers are allowed on your site - then hand you a UK-SMB-friendly recommended robots.txt.
Analysis
Found a robots.txt at https://www.bbc.co.uk/robots.txt
Issues:
- GPTBot is blocked via an explicit "User-agent: GPTBot" group.
- OAI-SearchBot is blocked via an explicit "User-agent: OAI-SearchBot" group.
- ChatGPT-User is blocked via an explicit "User-agent: ChatGPT-User" group.
- ClaudeBot is blocked via an explicit "User-agent: ClaudeBot" group.
- PerplexityBot is blocked via an explicit "User-agent: PerplexityBot" group.
- Perplexity-User is blocked via an explicit "User-agent: Perplexity-User" group.
- Google-Extended is blocked via an explicit "User-agent: Google-Extended" group.
- CCBot is blocked via an explicit "User-agent: CCBot" group.
- Applebot-Extended is blocked via an explicit "User-agent: Applebot-Extended" group.
- Bytespider is blocked via an explicit "User-agent: Bytespider" group.
- Meta-ExternalAgent is blocked via an explicit "User-agent: Meta-ExternalAgent" group.
| AI crawler | Status | What it does |
|---|---|---|
| GPTBot | ✗ BLOCKED | OpenAI training crawler |
| OAI-SearchBot | ✗ BLOCKED | OpenAI search index crawler |
| ChatGPT-User | ✗ BLOCKED | OpenAI on-demand fetch (when a user asks ChatGPT to read your page) |
| ClaudeBot | ✗ BLOCKED | Anthropic training crawler |
| Claude-User | ✓ Allowed | Anthropic on-demand fetch |
| Claude-SearchBot | ✓ Allowed | Anthropic search index crawler |
| PerplexityBot | ✗ BLOCKED | Perplexity search index crawler |
| Perplexity-User | ✗ BLOCKED | Perplexity on-demand fetch |
| Google-Extended | ✗ BLOCKED | Google control token for Gemini training and grounding (not a separate crawler) |
| CCBot | ✗ BLOCKED | Common Crawl (used by many AI training pipelines) |
| Applebot-Extended | ✗ BLOCKED | Apple control token for opting out of Apple foundation-model training (not a separate crawler) |
| Bytespider | ✗ BLOCKED | ByteDance / Doubao training crawler |
| Meta-ExternalAgent | ✗ BLOCKED | Meta AI training crawler |
| DuckAssistBot | ✓ Allowed | DuckDuckGo AI assistant |
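A status table like the one above can be approximated offline with Python's standard-library `urllib.robotparser`. This is a minimal sketch, not the checker's actual implementation: the `SAMPLE` rules, the `check_bots` helper, and the `example.com` URL are illustrative assumptions. Note that the standard-library parser matches agent names by substring and paths by plain prefix (no `*` or `$` wildcards), so its verdicts may differ from this tool's on more complex files.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical sample rules, loosely modelled on the analysed file.
SAMPLE = """\
User-agent: *
Disallow: /search/

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

# A small subset of the crawlers listed in the table (illustrative).
AI_BOTS = ["GPTBot", "ClaudeBot", "Claude-User", "DuckAssistBot"]


def check_bots(robots_text, bots, url="https://www.example.com/"):
    """Return {bot_name: True if the bot may fetch `url`, else False}."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())  # parse text directly, no network
    return {bot: parser.can_fetch(bot, url) for bot in bots}


statuses = check_bots(SAMPLE, AI_BOTS)
for bot, allowed in statuses.items():
    print(f"{bot}: {'Allowed' if allowed else 'BLOCKED'}")
```

Bots without a named group fall through to the `User-agent: *` rules, which is why a crawler can show as allowed even when many others are blocked.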
Sitemap declarations found:
- https://www.bbc.co.uk/sitemap.xml
- https://www.bbc.co.uk/sitemaps/https-index-uk-archive.xml
- https://www.bbc.co.uk/sitemaps/https-index-uk-news.xml
- https://www.bbc.co.uk/food/sitemap.xml
- https://www.bbc.co.uk/bitesize/sitemap/sitemapindex.xml
- https://www.bbc.co.uk/teach/sitemap/sitemapindex.xml
- https://www.bbc.co.uk/sitemaps/https-index-uk-archive_video.xml
- https://www.bbc.co.uk/sitemaps/https-index-uk-video.xml
- https://www.bbc.co.uk/sitemaps/sitemap-uk-ws-topics.xml
- https://www.bbc.co.uk/sport/sitemap.xml
- https://www.bbc.co.uk/sitemaps/sitemap-uk-topics.xml
- https://www.bbc.co.uk/ideas/sitemap.xml
- https://www.bbc.co.uk/tiny-happy-people/sitemap/sitemapindex.xml
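Sitemap declarations like those listed above can be pulled out of a robots.txt with the same standard-library parser. A minimal sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available); the `SAMPLE` text and the `list_sitemaps` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical sample; Sitemap lines may appear anywhere in the file.
SAMPLE = """\
User-agent: *
Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/food/sitemap.xml
Disallow: /search/
"""


def list_sitemaps(robots_text):
    """Return the Sitemap URLs declared in a robots.txt, in file order."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


for url in list_sitemaps(SAMPLE):
    print(url)
```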
Recommended
Drop-in robots.txt
UK-SMB-friendly defaults: every AI crawler explicitly allowed by name; admin / cart / checkout paths blocked. Tweak to taste.
# robots.txt - generated by llmsubmitter.com
# UK-SMB-friendly defaults: AI crawlers explicitly allowed by name,
# admin / transactional paths blocked. Edit to taste.

User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: GPTBot
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: OAI-SearchBot
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: ChatGPT-User
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: ClaudeBot
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: Claude-User
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: Claude-SearchBot
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: PerplexityBot
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: Perplexity-User
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: Google-Extended
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: CCBot
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: Applebot-Extended
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: Bytespider
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: Meta-ExternalAgent
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

User-agent: DuckAssistBot
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /api/
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /login
Disallow: /logout
Disallow: /search?
Disallow: /*?utm_
Disallow: /*?ref=

Sitemap: https://www.bbc.co.uk/sitemap.xml
# llms.txt: https://www.bbc.co.uk/llms.txt
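The recommended file repeats the same path rules under every named crawler because a crawler that matches a named group follows only that group and ignores the `User-agent: *` rules. A file of this shape could be generated along the following lines. This is a sketch under stated assumptions, not the generator this tool actually uses: `build_robots_txt`, `AI_BOTS`, `BLOCKED_PATHS`, and the `example.com` sitemap URL are hypothetical names chosen for illustration.

```python
# Paths blocked for every crawler, taken from the recommended file above.
BLOCKED_PATHS = (
    "/cart/", "/checkout/", "/account/", "/api/", "/admin/", "/wp-admin/",
    "/login", "/logout", "/search?", "/*?utm_", "/*?ref=",
)

# Crawlers that each receive an explicit named group.
AI_BOTS = (
    "GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "Claude-User",
    "Claude-SearchBot", "PerplexityBot", "Perplexity-User", "Google-Extended",
    "CCBot", "Applebot-Extended", "Bytespider", "Meta-ExternalAgent",
    "DuckAssistBot",
)


def build_robots_txt(sitemap_url, bots=AI_BOTS, blocked=BLOCKED_PATHS):
    """Build a robots.txt with one group for '*' plus one per named bot."""
    groups = []
    for agent in ("*", *bots):
        lines = [f"User-agent: {agent}"]
        # Repeat the path rules: named groups do not inherit from '*'.
        lines += [f"Disallow: {path}" for path in blocked]
        groups.append("\n".join(lines))
    groups.append(f"Sitemap: {sitemap_url}")
    return "\n\n".join(groups) + "\n"


robots_txt = build_robots_txt("https://www.example.com/sitemap.xml")
print(robots_txt)
```

Keeping the bot list and path list as data makes it straightforward to add a crawler or a blocked path without editing fifteen groups by hand.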
Your current robots.txt
For reference
# version: 36756f545af9144e59780f65727af4ba98093b3b
# The BBC's Terms of Use: https://www.bbc.co.uk/terms
# - Explain the rules for using our services
# - Tell you what you can do with our content
#
# In short: Please use our site like a human, not a robot.
# That means:
# - No scraping, crawling, or systematic extraction of content
# - No use of BBC content for training or fine-tuning AI models, including large language models (LLMs)
# - No retrieval-augmented generation (RAG), AI-powered search, agentic AI or grounding using BBC content
# - No creating datasets from BBC content
# - No text and data mining (TDM) under Article 4 of the EU Directive on Copyright in the Digital Single Market
# - No using BBC content to create summaries for your own use
# - No business use without permission (details: https://www.bbc.co.uk/usingthebbc/terms/can-i-use-bbc-content-for-my-business/)
# - The BBC reserves all rights in its content and expressly opts out of any statutory exceptions in any jurisdiction for text and data mining, as permitted by law
# TL;DR: Browse, read, watch, enjoy - like a human.
#
# HTTPS www.bbc.co.uk

User-agent: *
Sitemap: https://www.bbc.co.uk/sitemap.xml
Sitemap: https://www.bbc.co.uk/sitemaps/https-index-uk-archive.xml
Sitemap: https://www.bbc.co.uk/sitemaps/https-index-uk-news.xml
Sitemap: https://www.bbc.co.uk/food/sitemap.xml
Sitemap: https://www.bbc.co.uk/bitesize/sitemap/sitemapindex.xml
Sitemap: https://www.bbc.co.uk/teach/sitemap/sitemapindex.xml
Sitemap: https://www.bbc.co.uk/sitemaps/https-index-uk-archive_video.xml
Sitemap: https://www.bbc.co.uk/sitemaps/https-index-uk-video.xml
Sitemap: https://www.bbc.co.uk/sitemaps/sitemap-uk-ws-topics.xml
Sitemap: https://www.bbc.co.uk/sport/sitemap.xml
Sitemap: https://www.bbc.co.uk/sitemaps/sitemap-uk-topics.xml
Sitemap: https://www.bbc.co.uk/ideas/sitemap.xml
Sitemap: https://www.bbc.co.uk/tiny-happy-people/sitemap/sitemapindex.xml
Disallow: /asset/
Disallow: /backstage/bbc-login-help/
Disallow: /backstage/bbc-login-help$
Disallow: /bitesize/search$
Disallow: /bitesize/search/
Disallow: /bitesize/search?
Disallow: /cbbc/search$
Disallow: /cbbc/search/
Disallow: /cbbc/search?
Disallow: /cbeebies/search$
Disallow: /cbeebies/search/
Disallow: /cbeebies/search?
Disallow: /chwilio/
Disallow: /chwilio$
Disallow: /chwilio?
Disallow: /iplayer/bigscreen/
Disallow: /iplayer/cbbc/episodes/
Disallow: /iplayer/cbbc/search
Disallow: /iplayer/cbeebies/episodes/
Disallow: /iplayer/cbeebies/search
Disallow: /iplayer/search
Disallow: /indepthtoolkit/smallprox$
Disallow: /indepthtoolkit/smallprox/
Disallow: /moderation/reports/
Disallow: /modules/musicnav/language/
Disallow: /news/0
Disallow: /radio/aod/
Disallow: /radio/aod$
Disallow: /radio/imda
Disallow: /radio/player/
Disallow: /radio/player$
Disallow: /search/
Disallow: /search$
Disallow: /search?
Disallow: /sport/alpha/
Disallow: /sounds/player/
Disallow: /sounds/player$
Disallow: /ugc$
Disallow: /ugc/
Disallow: /ugcsupport$
Disallow: /ugcsupport/
Disallow: /userinfo/
Disallow: /userinfo
Disallow: /food/favourites
Disallow: /food/menus/*/shopping-list
Disallow: /food/recipes/*/shopping-list
Disallow: /food/search*?*
Disallow: /sounds/search$
Disallow: /sounds/search/
Disallow: /sounds/search?
Disallow: /ws/includes
Disallow: /rd/search$
Disallow: /rd/search/
Disallow: /rd/search?
Disallow: /things/search$
Disallow: /things/search/
Disallow: /things/search?

User-agent: Amazonbot
Disallow: /

User-agent: magpie-crawler
Disallow: /

User-agent: CCBot
Disallow: /

User-Agent: omgili
Disallow: /

User-Agent: omgilibot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Claude-Web
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: PetalBot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: Google-CloudVertexBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-Agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: YandexAdditional
Disallow: /

User-agent: YandexAdditionalBot
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: Brightbot
Disallow: /

User-agent: ApifyBot
Disallow: /

User-agent: ApifyWebsiteContentCrawler
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: Diffbot-User
Disallow: /

User-agent: ExaBot
Disallow: /

User-agent: TavilyBot
Disallow: /

User-agent: ShapBot
Disallow: /

User-agent: YouBot
Disallow: /

User-agent: FirecrawlAgent
Disallow:/

User-agent: Amzn-SearchBot
Disallow: /

User-agent: Amzn-User
Disallow: /

User-agent: ProRataInc
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: AhrefsBot
Disallow: /