Service Marketing Co.

Opening Your Website to GPTBot, ClaudeBot, PerplexityBot, and Google-Extended in 2026

You’ve worked hard on your website. Your local SEO is dialed in. Your service pages rank decently in Google. Then you open ChatGPT or Claude and ask “who is the best HVAC contractor near me.” Your business is nowhere in the answer.

That gap is not a content problem. It’s a permissions problem.

Most service business websites are technically blocking the AI bots that decide what shows up in AI search results. Sometimes the block is intentional. More often, it’s an accidental side effect of running a default WordPress robots.txt or an old SEO plugin from 2023.

In this article, we’ll walk you through the four AI crawlers that matter most in 2026. We’ll show you the exact directives to add. We’ll cover the adjacent retrieval bots that work alongside them. And we’ll show you how to verify the bots are actually showing up in your server logs.

If you’d rather have us handle the AI search setup, we’ll point you to the next step at the end.

Why Your Website Is Probably Invisible to AI Search

Most service business owners don’t realize one important thing. AI search engines don’t crawl your site the way Google does.

Google has been crawling the open web since 1998. Most websites trust Googlebot by default and let it through without thinking about it. The AI bots that power ChatGPT, Claude, Perplexity, and Google’s Gemini are newer. They have different names. They have different rules. And they have different opt-in behaviors.

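If your robots.txt file doesn’t explicitly mention them, the bots fall back to whatever your wildcard (User-agent: *) rules say. Under the robots.txt standard, silence means “allowed,” and that includes Google-Extended. The trouble is that many default or plugin-generated files carry blanket Disallow rules, and many firewalls turn away unfamiliar bots, so a site can shut AI crawlers out without ever naming them.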

The downstream effect is invisibility. According to Cloudflare’s crawler analysis, GPTBot is now referenced in the robots.txt of nearly 21% of the top 1,000 websites. Your competitors are taking sides on this. The ones who allowed AI crawlers eighteen months ago are getting cited inside AI answers right now.

The cost of staying silent is compounding every month.

How AI Crawlers Actually Work in 2026

The biggest shift in 2026 is that the major AI labs have all moved to a multi-bot architecture. One company runs several different bots. Each bot does a different job. And you can decide separately which ones to allow.

Here is the current breakdown:

  • OpenAI runs three bots. GPTBot handles training. OAI-SearchBot handles ChatGPT search indexing. ChatGPT-User fetches live pages when a ChatGPT user asks about your business in real time.
  • Anthropic runs three bots. ClaudeBot handles training. Claude-SearchBot handles search indexing inside Claude. Claude-User handles real-time retrieval when a Claude user asks a question.
  • Perplexity runs two bots. PerplexityBot handles indexing. Perplexity-User handles live retrieval.
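  • Google runs Google-Extended. This is a robots.txt control token, not a separate crawler: Google still fetches pages with its existing user agents. The token controls whether your content can be used to train Gemini models and to ground answers in the Gemini apps. Google AI Overviews are part of Search and follow your Googlebot rules instead. Google-Extended is fully separate from Googlebot (per Google Search Central). Allowing it has zero effect on your normal Google rankings.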

The reason this architecture matters is that it gives you choices. You can allow live retrieval bots without allowing training bots. For most service businesses, the right answer is to allow all of them. AI search visibility compounds. The ranking benefits show up faster when AI systems can both train on and live-fetch your content.
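
For readers who do want to split the two, here is a minimal Python sketch of what “allow retrieval, block training” could look like for OpenAI’s bots. It uses the standard library’s urllib.robotparser to check a robots.txt string. The rules in ROBOTS_TXT, the allowed_agents helper, and the example.com URL are illustrative assumptions, not a recommendation from this article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the training bot, allow search and live retrieval.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /
"""


def allowed_agents(robots_txt, agents, url="https://example.com/services/"):
    """Return {agent: True/False} for whether each agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(ROBOTS_TXT, ["GPTBot", "OAI-SearchBot", "ChatGPT-User"])
for agent, allowed in result.items():
    print(agent, "allowed" if allowed else "blocked")
```

The same pattern would apply to ClaudeBot versus Claude-User and Claude-SearchBot. Python’s parser is only an approximation of how each crawler reads the file, so treat the result as a sanity check.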

The Exact robots.txt Directives to Add

Your robots.txt file lives at the root of your domain. For example: https://your-website.com/robots.txt. If you can open it in a browser, the bots can read it.

Paste this block into your robots.txt to allow the four primary AI crawlers across your entire site:

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

A few things to know about this block:

  1. The order doesn’t matter. Keeping each user agent in its own group makes the file easier to audit later.
  2. Allow: / means the bot can access every page on your site. Add a Disallow: line under the user agent for any path you want to keep private (like /admin or /thank-you).
  3. You don’t need to remove other rules. Your existing Googlebot and Bingbot directives stay exactly as they are.
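
One way to sanity-check the notes above before uploading is to run the combined file through Python’s built-in urllib.robotparser. This is a rough sketch: the EXISTING rules stand in for a typical WordPress default, and the can_fetch helper and example.com paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed stand-in for rules a site already has (WordPress-style default).
EXISTING = """\
User-agent: *
Disallow: /wp-admin/
"""

# The four-crawler block from this article.
AI_BLOCK = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
"""


def can_fetch(robots_txt, agent, path):
    """Check one agent against one path on a hypothetical example.com site."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "https://example.com" + path)


combined = EXISTING + "\n" + AI_BLOCK
checks = {
    "Googlebot /services/": can_fetch(combined, "Googlebot", "/services/"),
    "Googlebot /wp-admin/": can_fetch(combined, "Googlebot", "/wp-admin/"),
    "GPTBot /services/": can_fetch(combined, "GPTBot", "/services/"),
    "GPTBot /wp-admin/": can_fetch(combined, "GPTBot", "/wp-admin/"),
}
for label, allowed in checks.items():
    print(label, "allowed" if allowed else "blocked")
```

The last check is the one to notice. Once a bot has its own group, it follows that group and stops reading the wildcard rules, so a path blocked under User-agent: * is no longer blocked for that bot. Any path that should stay private needs its own Disallow: line inside each AI group.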

If you use Yoast or Rank Math, look for the “edit robots.txt” option inside the plugin settings. On WordPress.com sites, the file is editable through Jetpack tools. On a custom-built site, upload a new robots.txt to your root by FTP or your hosting control panel.

Allow the Retrieval Bots Too

This is where most AI optimization guides stop. We’re going one layer deeper. The retrieval bots (the ones that fetch your page in real time) are where the fastest wins live.

A homeowner types “tune up my furnace in San Diego” into ChatGPT. ChatGPT-User then fetches relevant pages in real time. If your site blocks ChatGPT-User, you can’t be cited in that answer. The same logic applies to Claude-User and Perplexity-User.

Add this block underneath the first one:

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Perplexity-User
Allow: /

Combined with the four primary crawlers above, you now have nine AI user agents explicitly allowed. This covers training, search indexing, and live retrieval across OpenAI, Anthropic, and Perplexity, plus Gemini training through Google-Extended.

Verify the Bots Are Actually Crawling Your Website

Adding the directives is step one. Confirming the bots are showing up is step two.

The simplest way to verify is to check your server access logs. On cPanel, look for “Raw Access Logs.” On a managed host like WP Engine or Kinsta, check the analytics dashboard. Behind Cloudflare, the bot activity report is the cleanest view.

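Filter the logs by user agent. Look for the strings GPTBot, ClaudeBot, PerplexityBot, and the retrieval bot names. Google-Extended is the exception: it is a robots.txt token only, and Google fetches pages under its regular user agents, so it never shows up in your logs. Crawl activity should appear within a few days of updating your robots.txt.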
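
If you can download the raw log file, the filtering can be done with a few lines of Python. The sketch below assumes the common “combined” access log format, where the user agent is the last quoted field. The sample lines, IP addresses, and the count_ai_hits helper are invented for illustration, and the user agent strings are simplified. Google-Extended is left out of the list because it is a robots.txt token rather than an HTTP user agent.

```python
import re
from collections import Counter

# Tokens to look for in the User-Agent field of each request.
AI_TOKENS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Perplexity-User",
]

# In the combined log format, the user agent is the last quoted field on the line.
USER_AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_ai_hits(log_lines):
    """Count requests per AI bot token, matching case-insensitively."""
    counts = Counter()
    for line in log_lines:
        match = USER_AGENT_FIELD.search(line)
        if not match:
            continue  # line is not in the expected format
        user_agent = match.group(1).lower()
        for token in AI_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break
    return counts


# Made-up sample lines; real user agent strings are longer.
SAMPLE_LOG = [
    '203.0.113.5 - - [02/Mar/2026:10:15:01 +0000] "GET /services/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.5 - - [02/Mar/2026:10:15:04 +0000] "GET /about/ HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '198.51.100.7 - - [02/Mar/2026:11:02:44 +0000] "GET /robots.txt HTTP/1.1" 200 312 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.10 - - [02/Mar/2026:11:05:12 +0000] "GET / HTTP/1.1" 200 8192 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

hits = count_ai_hits(SAMPLE_LOG)
for token, count in hits.items():
    print(token, count)
```

To run it on a real file, pass an open file handle instead of SAMPLE_LOG. User agent strings can be spoofed, so a match here is a good sign rather than proof that the request came from the named company.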

If a week passes and you see no AI bot traffic, double-check three things:

  1. Your robots.txt file is publicly readable at your-domain.com/robots.txt.
  2. There’s no firewall or security plugin (Wordfence, Sucuri) blocking traffic from cloud IP ranges.
  3. Your sitemap is submitted and reachable. Most AI bots use sitemaps to prioritize what to crawl first.
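
The first check can be scripted as well. This is a small sketch using only Python’s standard library. The fetch_robots and diagnose helpers, the robots-check user agent name, and the specific warning messages are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def fetch_robots(url, timeout=10):
    """Fetch a robots.txt URL and return (HTTP status, body text)."""
    request = urllib.request.Request(url, headers={"User-Agent": "robots-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # 403, 404, and similar responses arrive here as exceptions.
        return error.code, ""


def diagnose(status, body):
    """Turn a status code and body into a short human-readable verdict."""
    if status != 200:
        return f"not readable (HTTP {status})"
    lowered = body.lower()
    if "<html" in lowered:
        return "looks like an HTML page, not a robots.txt file"
    if "user-agent" not in lowered:
        return "readable, but contains no User-agent groups"
    return "ok"


# Example call (makes a network request, so it is left commented out):
# print(diagnose(*fetch_robots("https://your-domain.com/robots.txt")))
```

A 403 from this script while the file opens fine in your browser would point toward a firewall or security plugin, which is the second item on the list.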

You can also run a manual test. Open ChatGPT, Claude, and Perplexity. Ask each one a question that should naturally surface your business. If your site is allowed, the AI will often cite your URL or pull a snippet from your page. Repeat this audit monthly.

Where AI Search Indexing Is Headed

Allowing AI bots is no longer a maybe. It’s the baseline.

The next layer of AI search optimization is structured trust signals: schema markup, llms.txt files, and a clean Quick Answer block at the top of every key page. We’ve covered those moves in our field guide to mastering AIO, AEO, and GEO. That post is the pillar resource for service business owners getting serious about AI visibility in 2026.

For service businesses in the trades, the next move after robots.txt is auditing your full digital footprint. Our HVAC marketing hub walks through the rest of the AI-era playbook, from Google Business Profile to schema. Both pieces work together.

Cloudflare and other infrastructure providers are rolling out paid marketplaces where AI companies pay publishers per crawl. That market is still early. The first businesses to benefit will be the ones with clean robots.txt files and strong content in the AI pipeline.

Get the foundation right now, and the future moves get easier.

Need help or guidance with your AI search optimization or service business marketing? Reach out to us at servicemarketing.co and schedule a time to chat together. We’re here to help you grow.
