ChatGPT (OpenAI)

Traffic

OpenAI operates three distinct bots:

Bot
User agent
Tracking

GPTBot: used for crawling content that may be used to train OpenAI's generative AI foundation models.

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot

Current CIDRs: https://openai.com/gptbot.json (the user agent may be a suitable fallback, but it is easily spoofed).

OAI-SearchBot: used for ChatGPT's search features. It is not used to crawl content for training AI models.

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot

Current CIDRs: https://openai.com/searchbot.json (the user agent may be a suitable fallback, but it is easily spoofed).

ChatGPT-User: used when a user asks ChatGPT or a Custom GPT to visit a web page. It is not used for automatic crawling or AI training.

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot

Current CIDRs: https://openai.com/chatgpt-user.json (the user agent may be a suitable fallback, but it is easily spoofed).
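As a rough illustration of the user-agent fallback, the sketch below extracts the bot token a request claims to be. The function name and regular expression are illustrative assumptions, not anything OpenAI publishes, and a match only means the request claims to be that bot, since the header is trivially spoofed.

```python
import re
from typing import Optional

# Product tokens taken from the user-agent strings listed above.
OPENAI_BOT_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")

# Match "<token>/<version>" and refuse matches glued to a longer word,
# so that e.g. "NotGPTBot/1.0" is not treated as GPTBot.
_TOKEN_RE = re.compile(
    r"(?<![\w-])("
    + "|".join(re.escape(token) for token in OPENAI_BOT_TOKENS)
    + r")/\d+(?:\.\d+)*"
)


def openai_bot_from_user_agent(user_agent: str) -> Optional[str]:
    """Return the OpenAI bot token claimed by a user-agent string, or None.

    Hypothetical helper: a hit is a weak signal only, because any client
    can send any user-agent header.
    """
    match = _TOKEN_RE.search(user_agent or "")
    return match.group(1) if match else None
```

Treat the result as a hint to confirm against the CIDR feeds or a verified signature rather than as proof.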

ChatGPT also generates direct user traffic: when a user clicks a link in ChatGPT, the landing URL carries this query parameter:

?utm_source=chatgpt.com
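A minimal sketch of detecting that parameter on a landing URL, using only the Python standard library. The function name is a hypothetical choice for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def is_chatgpt_click(url: str) -> bool:
    """Return True if the URL's query string has utm_source=chatgpt.com."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each key to a list of values, so check membership.
    return "chatgpt.com" in query.get("utm_source", [])
```

Because anyone can append this parameter to a link, it indicates a probable ChatGPT click rather than a verified one.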

Tracking

Server-side

  1. Verify the HTTP Message Signature on the incoming request

Client-side

  1. Prefer CIDR matching (keep the list updated)

OpenAI publishes the list of ChatGPT-User CIDRs here:

https://openai.com/chatgpt-user.json

Each feed lists the IP ranges the corresponding bot sends requests from. The equivalent feeds for OAI-SearchBot and GPTBot are:

https://openai.com/searchbot.json

https://openai.com/gptbot.json
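The sketch below shows one way to match a client IP against such a feed with Python's standard `ipaddress` module. It assumes the feed body has the shape `{"prefixes": [{"ipv4Prefix": "..."}, {"ipv6Prefix": "..."}]}`; check the live JSON before relying on that. The sample feed uses reserved documentation ranges, not real OpenAI addresses, and the function names are hypothetical.

```python
import ipaddress
import json
from typing import List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_prefixes(feed_json: str) -> List[Network]:
    """Parse a CIDR feed body into network objects.

    ASSUMPTION: entries live under "prefixes" and use the keys
    "ipv4Prefix" / "ipv6Prefix". Adjust if the real feed differs.
    """
    data = json.loads(feed_json)
    networks: List[Network] = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr, strict=False))
    return networks


def ip_in_feed(ip: str, networks: List[Network]) -> bool:
    """Return True if the address falls inside any of the networks."""
    address = ipaddress.ip_address(ip)
    # Membership is False when IP versions differ, so mixed lists are safe.
    return any(address in network for network in networks)


# Made-up sample using documentation-only ranges (RFC 5737 / RFC 3849).
SAMPLE_FEED = (
    '{"prefixes": [{"ipv4Prefix": "192.0.2.0/24"},'
    ' {"ipv6Prefix": "2001:db8::/32"}]}'
)
SAMPLE_NETWORKS = parse_prefixes(SAMPLE_FEED)
```

In practice you would download the feed on a schedule and cache the parsed networks, since the ranges change over time.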

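What the “ChatGPT-User” CIDRs represent

  • They are the IP ranges that ChatGPT-User requests originate from, that is, fetches made when a user asks ChatGPT or a Custom GPT to visit a web page.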

What they are not

  • They are not the training crawler (“GPTBot”) IPs. GPTBot is a separate crawler used for web data collection and is documented independently. (OpenAI)

How to positively identify ChatGPT traffic (recommended over IP matching alone)

  • Validate the HTTP Message Signatures ChatGPT adds to outbound requests (Signature, Signature-Input, and Signature-Agent: "https://chatgpt.com"). This cryptographically proves a request came from ChatGPT. (OpenAI Help Center)

  • Major CDNs also expose verified detections (e.g., Vercel Verified Bots and Cloudflare’s bot directory entry for ChatGPT agent), which you can allowlist. (Vercel)
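Before doing real verification, an edge function can cheaply reject requests that do not even carry the expected headers. The sketch below is only that pre-check: it confirms the three headers are present and that Signature-Agent names ChatGPT. It does not verify the signature. Actual verification means checking the Signature value against OpenAI's published public keys as described by HTTP Message Signatures (RFC 9421), which is deliberately left out here. The function name is hypothetical.

```python
from typing import Mapping

EXPECTED_SIGNATURE_AGENT = "https://chatgpt.com"


def has_chatgpt_signature_headers(headers: Mapping[str, str]) -> bool:
    """Cheap pre-check only; this does NOT prove the request is genuine.

    Returns True when Signature and Signature-Input are present and
    Signature-Agent (a quoted string) equals https://chatgpt.com.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if not lowered.get("signature") or not lowered.get("signature-input"):
        return False
    agent = lowered.get("signature-agent", "").strip().strip('"')
    return agent == EXPECTED_SIGNATURE_AGENT
```

A request that passes this check still has to go through cryptographic verification before being flagged as verified ChatGPT traffic.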

Heads-up on the “ChatGPT-User” label

  • Many site owners also see the “ChatGPT-User” user-agent token associated with these requests; it’s widely documented by third-party bot directories. Use signatures for assurance, since UAs can be spoofed. (Dark Visitors)

Reference to the CIDR list itself

  • OpenAI publishes the “ChatGPT-User” IP ranges as a JSON feed (/chatgpt-user.json). Use that feed for the current list, but rely on signature verification when possible. (community.openai.com)

Future

PostHog transforms (Hog) run on event payloads, not raw HTTP requests, and they don’t have access to inbound request headers (e.g. Signature, Signature-Input). They also can’t make outbound fetches to validate signatures. Therefore you can’t verify HTTP Message Signatures inside a PostHog transform.

Do signature verification at your edge/app (e.g., Cloudflare Worker, server, CDN function). When a request is verified as ChatGPT, attach a flag to the PostHog event you emit (or set a cookie/param that your client-side capture reads). Then have Hog check that flag first, before any other signal.

Minimal Hog change (add this to the top of your classifier):

Pipeline summary:

  1. Edge/server: verify ChatGPT HTTP message signature on the incoming request; if valid, include chatgpt_signature_verified=true when you send/capture the PostHog event (or expose it to the client so the JS capture can attach it).

  2. Hog: check chatgpt_signature_verified first; if it is absent or false, fall back to your other signals:

    • CIDR match → ai_traffic_type='chatgpt_user', ai_traffic_identifier='ip'

    • GPTBot UA/IP → ai_traffic_type='gptbot', ai_traffic_identifier='ua' or 'ip'

    • Referrer/UA (“chatgpt.com”, “chat.openai.com”, “ChatGPT”) → ai_traffic_type='chatgpt_click', ai_traffic_identifier='referrer' or 'ua'

This yields cryptographic certainty when available, with IP/UA/referrer as explicit fallbacks.
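The precedence described in the pipeline summary can be sketched as follows. This is Python rather than Hog, purely to illustrate the ordering. The input keys other than chatgpt_signature_verified (ip_in_chatgpt_user_cidrs, ip_in_gptbot_cidrs, user_agent, referrer) and the 'signature' identifier value are assumptions made for the example, not names from PostHog or OpenAI.

```python
from typing import Dict, Optional

REFERRER_HINTS = ("chatgpt.com", "chat.openai.com")


def classify_ai_traffic(signals: Dict[str, object]) -> Optional[Dict[str, str]]:
    """Return ai_traffic_* properties using the strongest available signal."""

    def label(traffic_type: str, identifier: str) -> Dict[str, str]:
        return {
            "ai_traffic_type": traffic_type,
            "ai_traffic_identifier": identifier,
        }

    # 1. Flag set by the edge/server after signature verification.
    #    ASSUMPTION: 'signature' as the identifier value is our own choice.
    if signals.get("chatgpt_signature_verified") is True:
        return label("chatgpt_user", "signature")
    # 2. IP falls inside the ChatGPT-User CIDR feed.
    if signals.get("ip_in_chatgpt_user_cidrs"):
        return label("chatgpt_user", "ip")
    # 3. GPTBot, by IP first (harder to fake), then by user agent.
    user_agent = str(signals.get("user_agent") or "")
    if signals.get("ip_in_gptbot_cidrs"):
        return label("gptbot", "ip")
    if "GPTBot" in user_agent:
        return label("gptbot", "ua")
    # 4. Click-through from ChatGPT, by referrer first, then user agent.
    referrer = str(signals.get("referrer") or "")
    if any(hint in referrer for hint in REFERRER_HINTS):
        return label("chatgpt_click", "referrer")
    if "ChatGPT" in user_agent:
        return label("chatgpt_click", "ua")
    return None
```

Ordering the checks from strongest to weakest means a spoofed user agent can never override a verified signature or an IP match.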
