How does the AI agent traffic detector work?

Detects ChatGPT browse / Claude tool-use / Perplexity sonar visits in real time, distinct from crawler analytics.

Intermediate Last updated 3 May 2026

Crawler analytics (the existing GPTBot / ClaudeBot / PerplexityBot tab on /app/sites/{id}) records indexer crawls, which are one-shot URL fetches. Agent traffic is different: it is an AI engine retrieving your page on behalf of a live user.

How we tell them apart:

  1. UA fingerprint - chatgpt-user, oai-searchbot, claude-web, anthropic-ai, perplexity, gemini are agent-mode UAs (vs GPTBot / ClaudeBot which are indexers).
  2. JA3 hash - when Cloudflare is in front of the site, we receive the TLS fingerprint and can deduplicate visits by network identity.
  3. Behavioural signals - headless browser, no mouse events, no browser plugins, high fetch rate from a single IP.
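
As an illustration of the first signal only, UA matching can be sketched roughly as below. This is a minimal sketch, not the detector's actual code: the function name classify_user_agent and the return labels are hypothetical, and case-insensitive substring matching is an assumption. The tokens themselves come from the list above.

```python
from typing import Literal

# Agent-mode tokens from the list above (lowercased for matching).
AGENT_TOKENS = (
    "chatgpt-user", "oai-searchbot", "claude-web",
    "anthropic-ai", "perplexity", "gemini",
)
# Indexer crawlers named in this article (lowercased for matching).
INDEXER_TOKENS = ("gptbot", "claudebot", "perplexitybot")


def classify_user_agent(user_agent: str) -> Literal["agent", "indexer", "other"]:
    """Hypothetical helper: label a User-Agent string by substring match."""
    ua = user_agent.lower()
    # Check indexers first: "perplexitybot" also contains the agent
    # token "perplexity", so the order of these checks matters.
    if any(token in ua for token in INDEXER_TOKENS):
        return "indexer"
    if any(token in ua for token in AGENT_TOKENS):
        return "agent"
    return "other"
```

Checking indexer tokens before agent tokens keeps PerplexityBot in crawler analytics rather than counting it as agent traffic. A UA string is easy to spoof, which is why it is only one of the three signals.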

Each detection is stored in agent_visits with a confidence score from 0.0 to 1.0. The admin observability page at /admin/agent-traffic shows the last 30 days of detections grouped by detected agent, plus a table of recent visits.
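
To make the 30-day view concrete, here is a rough sketch of grouping detections by agent. The AgentVisit fields (agent, confidence, seen_at) and the function visits_by_agent are assumptions for illustration; the real agent_visits schema is not documented here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AgentVisit:
    """Hypothetical shape of one agent_visits row."""
    agent: str           # detected agent, e.g. "chatgpt-user"
    confidence: float    # score between 0.0 and 1.0
    seen_at: datetime    # when the visit was recorded


def visits_by_agent(rows: list[AgentVisit], now: datetime, days: int = 30) -> Counter:
    """Count visits per detected agent within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return Counter(row.agent for row in rows if row.seen_at >= cutoff)
```

Passing now explicitly, instead of reading the clock inside the function, keeps the window calculation easy to test.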

Turning it on:

  1. Go to /admin/features and switch feature.agent_traffic.enabled to ON.
  2. The existing /track endpoint (already on every monitored site) starts classifying every hit.
  3. A confidence threshold filters out noise: only detections scoring >= 0.6 are stored.
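
The storage rule in the steps above can be sketched as follows. The names MIN_CONFIDENCE and should_store are hypothetical; only the 0.6 cut-off, the 0.0-1.0 range, and the feature flag come from this article.

```python
# Threshold taken from the steps above; the constant name is an assumption.
MIN_CONFIDENCE = 0.6


def should_store(confidence: float, feature_enabled: bool) -> bool:
    """Hypothetical gate: keep a detection only if the feature flag is on
    and the confidence score meets the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return feature_enabled and confidence >= MIN_CONFIDENCE
```

A score of exactly 0.6 is kept, since the rule is "greater than or equal to".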

For higher-coverage detection (behavioural signals beyond UA), the JS pixel on /track.js will be extended in a follow-up to capture mouse / scroll / focus events and POST them with the visit.
