How does the AI agent traffic detector work?

Crawler analytics (the existing GPTBot / ClaudeBot / PerplexityBot tab on /app/sites/{id}) catches indexer crawls - one-shot URL fetches. Agent traffic is different: an AI engine retrieving your page on behalf of a live user.

How we tell them apart:

UA fingerprint - chatgpt-user, oai-searchbot, claude-web, anthropic-ai, perplexity, gemini are agent-mode UAs (vs GPTBot / ClaudeBot which are indexers).
JA3 hash - if Cloudflare is in front, we get the TLS fingerprint and can dedupe by network identity.
Behavioural signals - headless browser, no mouse, no plugins, high fetch rate from a single IP.

Each detection is stored in agent_visits with a confidence score 0.0-1.0. The admin observability page at /admin/agent-traffic shows the last 30 days by detected agent + a recent table.

Turning it on:

/admin/features -> flip feature.agent_traffic.enabled to ON.
The existing /track endpoint (already on every monitored site) starts classifying every hit.
Confidence-threshold filters out the noise (only rows >= 0.6 are stored).

For higher-coverage detection (behavioural signals beyond UA), the JS pixel on /track.js will be extended in a follow-up to capture mouse / scroll / focus events and POST them with the visit.

How does the AI agent traffic detector work?

Was this helpful?