How to Audit Your Brand on ChatGPT, Claude, and Perplexity

Published on June 1, 2026 | by Vijay

Most teams that run their first AI brand audit walk away with one of two reactions. The first is mild relief, the kind that comes when ChatGPT does at least name the brand and gets the category roughly right. The second is the harder one, where the model confuses you with a competitor, lists you under the wrong use case, or doesn’t surface you at all on a query your prospects ask every day.

Either reaction is more useful than not knowing. The audit itself isn’t complicated. It’s the discipline of doing it consistently that most teams skip. Once you build the rhythm, the audit becomes the baseline that tells you whether your other AI visibility work is moving the needle.

What an AI brand audit actually covers

A useful audit answers three questions for each major AI engine. Does your brand appear when buyers ask category-relevant questions? Is the description accurate when it does appear? How does the answer compare across engines for the same prompt?

Each engine handles questions differently, and the differences matter. Understanding which AI to use for which task is part of the picture, but the audit is about how the engines treat your brand, not which one your team prefers internally. ChatGPT leans on encyclopedic and synthesis-friendly sources. Perplexity weights community discussion and review platforms more heavily. Claude tends toward authoritative, structured content with longer reasoning chains. Each will represent your brand differently for the same prompt.

Step 1: Build the prompt set

Start with 20-30 buyer-intent prompts that map to real questions in your category. Skip generic “what is X” queries and use the actual questions a prospect would ask when researching options.

A useful mix:

  • Category-level questions (“what are the best tools for X”)
  • Use-case-specific questions (“which tool fits Y workflow”)
  • Comparison questions (“how does Brand A compare to Brand B”)
  • Problem-led questions (“what should I do when Z happens”)
  • Edge questions where you have a credible position but aren’t an obvious incumbent

Pull these from sales call transcripts, Search Console queries with question patterns, Reddit threads in your category, and your own buyers’ history. Don’t overthink it. The list will evolve once you start seeing what the engines surface.
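If you would rather keep the prompt set in code than in a document, a minimal sketch might look like the one below. The category labels, the `Prompt` fields, and the `check_prompt_set` helper are all illustrative names, not part of any tool. The check only confirms that the list is in the 20-30 range and that each of the five prompt types above is represented.

```python
from collections import Counter
from dataclasses import dataclass

# Short labels for the five prompt types listed above (labels are our own).
CATEGORIES = ("category", "use_case", "comparison", "problem", "edge")


@dataclass(frozen=True)
class Prompt:
    text: str
    category: str  # one of CATEGORIES
    source: str    # e.g. "sales_call", "search_console", "reddit"


def check_prompt_set(prompts, min_size=20, max_size=30):
    """Return a list of warnings about the size and mix of the prompt set."""
    warnings = []
    if not min_size <= len(prompts) <= max_size:
        warnings.append(f"{len(prompts)} prompts; aim for {min_size}-{max_size}")
    counts = Counter(p.category for p in prompts)
    for cat in CATEGORIES:
        if counts[cat] == 0:
            warnings.append(f"no prompts in category '{cat}'")
    for cat in sorted(set(counts) - set(CATEGORIES)):
        warnings.append(f"unknown category '{cat}'")
    return warnings
```

An empty list back from `check_prompt_set` means the mix is balanced enough to start running.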

Step 2: Run the prompts across each engine

The mechanics are simple. Open ChatGPT, Perplexity, and Claude in separate windows and run each prompt fresh, without prior chat history influencing the answer. Capture the response, the position of your brand mention, the description used, and any competitors named.

A small detail that matters: run each prompt on all three engines in the same session, ideally at the same time of day, since answers can shift slightly from run to run and as the underlying systems change. Don’t tune the prompts for each engine. The audit’s value is seeing how the same question produces different answers across the platforms.

For each engine, capture three things per prompt:

  • Did your brand appear at all (yes/no)
  • If yes, where in the answer (first paragraph, mid-answer, tail)
  • How was the brand described (accurately, partially, inaccurately, omitted)

A spreadsheet handles this fine. The point is consistency over sophistication.
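For teams that want the spreadsheet to stay consistent from one run to the next, here is one possible shape for a row, written out as CSV. The field names, the allowed values, and the `write_observations` helper are assumptions made for illustration; the validation simply rejects rows that don't match the capture rules above.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

ENGINES = ("chatgpt", "perplexity", "claude")
POSITIONS = ("first_paragraph", "mid_answer", "tail")
DESCRIPTIONS = ("accurate", "partial", "inaccurate", "omitted")


@dataclass
class Observation:
    run_date: str     # ISO date of the audit run
    engine: str       # one of ENGINES
    prompt: str
    appeared: bool
    position: str     # one of POSITIONS, or "" when the brand did not appear
    description: str  # one of DESCRIPTIONS
    competitors: str  # names from the answer, separated by ";"

    def __post_init__(self):
        if self.engine not in ENGINES:
            raise ValueError(f"unknown engine: {self.engine}")
        if self.description not in DESCRIPTIONS:
            raise ValueError(f"unknown description: {self.description}")
        if self.appeared and self.position not in POSITIONS:
            raise ValueError(f"unknown position: {self.position}")
        if not self.appeared and self.position:
            raise ValueError("position must be empty when the brand did not appear")


def write_observations(observations, handle):
    """Write one CSV row per (engine, prompt) observation."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(Observation)])
    writer.writeheader()
    for obs in observations:
        writer.writerow(asdict(obs))


# In-memory example; swap the buffer for open("audit.csv", "w", newline="").
buffer = io.StringIO()
write_observations(
    [Observation("2026-06-01", "claude", "what are the best tools for X",
                 True, "mid_answer", "partial", "Brand A;Brand B")],
    buffer,
)
```

Keeping the allowed values in one place means a typo like "mid answer" fails loudly instead of quietly skewing the scores later.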

Step 3: Compare across engines

The cross-engine view is where the most interesting patterns surface. A brand that shows up confidently on Perplexity but is entirely missing on ChatGPT often has good third-party review presence but weak Wikipedia or general-source coverage. A brand that appears on ChatGPT but not Claude usually has surface-level mentions without the depth of structured content that Claude weights more heavily.

Pay particular attention to where the same prompt produces different competitor sets. If Perplexity surfaces Brand A as your main competitor and ChatGPT surfaces Brand B, the engines don’t share the same mental map of your category. That tells you something about which sources are shaping the perception.
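One rough way to put a number on how far the competitor sets diverge is a pairwise overlap score per prompt. The sketch below uses Jaccard overlap (shared names divided by all names), which is our choice for illustration rather than anything the engines expose; the function name and the example brands are placeholders.

```python
from itertools import combinations


def competitor_overlap(sets_by_engine):
    """Jaccard overlap of competitor sets for every pair of engines on one prompt.

    sets_by_engine maps an engine name to the set of competitors it named.
    1.0 means identical sets, 0.0 means no competitor in common.
    """
    overlaps = {}
    for a, b in combinations(sorted(sets_by_engine), 2):
        set_a, set_b = sets_by_engine[a], sets_by_engine[b]
        union = set_a | set_b
        # Two empty sets agree trivially, so treat them as full overlap.
        overlaps[(a, b)] = len(set_a & set_b) / len(union) if union else 1.0
    return overlaps


example = competitor_overlap({
    "chatgpt": {"Brand B", "Brand C"},
    "perplexity": {"Brand A", "Brand C"},
    "claude": {"Brand B", "Brand C"},
})
```

In this example ChatGPT and Claude agree completely, while Perplexity shares only one of three names with each. Prompts with low overlap are the ones worth reading side by side.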

Watch for shifts after model updates too. The launch of GPT-5 and the subsequent changes to ChatGPT introduced meaningful differences in how the model surfaces brands and categories. Audits run before and after major model updates often look different even when nothing on your end has changed.

Step 4: Score and prioritise

Once you have the raw data, the next step is scoring. A simple framework that holds up:

  • Citation rate per engine (percentage of prompts that surfaced your brand)
  • Description accuracy score (1-5 average across the prompts where you appeared)
  • Position score (weighted higher for first-paragraph mentions)
  • Cross-engine consistency (how many engines surfaced you for each prompt)

Aggregate these into a per-engine summary and a cross-engine summary. The per-engine view tells you where your weakest presence is. The cross-engine view tells you which prompts are your biggest opportunities.
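The four metrics above can be sketched in a few lines. Everything specific here is an assumption: the row keys, the position weights (1.0, 0.6, 0.3), and the idea that a reviewer has already assigned each appearance a 1-5 accuracy score. Adjust the weights to whatever your team agrees on; what matters is that they stay fixed between runs.

```python
from collections import defaultdict
from statistics import mean

# Assumed weights; first-paragraph mentions count the most.
POSITION_WEIGHTS = {"first_paragraph": 1.0, "mid_answer": 0.6, "tail": 0.3}


def per_engine_summary(rows):
    """Citation rate, mean accuracy, and position score for each engine."""
    by_engine = defaultdict(list)
    for row in rows:
        by_engine[row["engine"]].append(row)
    summary = {}
    for engine, engine_rows in by_engine.items():
        hits = [r for r in engine_rows if r["appeared"]]
        summary[engine] = {
            "citation_rate": len(hits) / len(engine_rows),
            # Accuracy is averaged only over prompts where the brand appeared.
            "accuracy": mean(r["accuracy"] for r in hits) if hits else None,
            # Position score is averaged over all prompts, so absences count as 0.
            "position_score": sum(POSITION_WEIGHTS[r["position"]] for r in hits)
            / len(engine_rows),
        }
    return summary


def cross_engine_consistency(rows):
    """Number of engines that surfaced the brand, per prompt."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["prompt"]] += int(row["appeared"])
    return dict(counts)


rows = [
    {"engine": "chatgpt", "prompt": "p1", "appeared": True, "position": "first_paragraph", "accuracy": 4},
    {"engine": "chatgpt", "prompt": "p2", "appeared": False, "position": None, "accuracy": None},
    {"engine": "perplexity", "prompt": "p1", "appeared": True, "position": "tail", "accuracy": 2},
    {"engine": "perplexity", "prompt": "p2", "appeared": True, "position": "mid_answer", "accuracy": 5},
]
summary = per_engine_summary(rows)
consistency = cross_engine_consistency(rows)
```

With these sample rows, ChatGPT has a citation rate of 0.5 and Perplexity 1.0, and prompt `p1` was surfaced by two engines while `p2` was surfaced by one.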

Then identify the top three prompts where you should appear but don’t, and the top three where the description is materially wrong. Those become the work queue for the next quarter.
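Picking the top three is ultimately a judgment call, but a first pass can be automated. The sketch below ranks absences by how many engines missed you and ranks wrong descriptions by lowest average accuracy. Both ranking rules, the `wrong_below` threshold, and the row keys are assumptions; it has no way of knowing which prompts you "should" appear for, so treat the output as a shortlist to review.

```python
from collections import defaultdict
from statistics import mean


def work_queue(rows, top_n=3, wrong_below=3):
    """Shortlist prompts where the brand is absent or poorly described."""
    missing = defaultdict(int)   # prompt -> number of engines that omitted the brand
    scores = defaultdict(list)   # prompt -> accuracy scores where the brand appeared
    for row in rows:
        if row["appeared"]:
            scores[row["prompt"]].append(row["accuracy"])
        else:
            missing[row["prompt"]] += 1

    # Most-missed prompts first; ties broken alphabetically for stable output.
    absent = sorted(missing, key=lambda p: (-missing[p], p))[:top_n]

    averages = {p: mean(s) for p, s in scores.items()}
    wrong = sorted(
        (p for p in averages if averages[p] < wrong_below),
        key=lambda p: (averages[p], p),
    )[:top_n]
    return {"absent": absent, "wrong": wrong}
```

A prompt can land in both lists when one engine omits the brand and the others describe it badly, which is usually a sign it belongs at the top of the queue.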

Step 5: Build the recurring rhythm

The first audit is informative. The second one tells you whether anything is moving. From the third onward, you start getting the trend lines that make the audit valuable for quarterly business review (QBR) conversations.

A workable cadence is monthly for the prompt set, with a deeper quarterly review where you refresh the prompt list, add new buyer questions you’ve seen surface, and retire any that no longer reflect real research patterns. The engines themselves change quickly enough that the prompt set should evolve too.
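Once a few monthly runs exist, the trend line is just the change between consecutive runs. A minimal sketch, assuming you keep a chronological list of (run label, citation rate) pairs per engine; the function name and the sample figures are made up for illustration.

```python
def citation_trend(history):
    """Change in citation rate between consecutive runs.

    history is a chronological list of (run_label, citation_rate) pairs.
    Returns one (run_label, change) pair for every run after the first.
    """
    return [
        (label, round(rate - previous_rate, 4))
        for (_, previous_rate), (label, rate) in zip(history, history[1:])
    ]


trend = citation_trend([("2026-04", 0.20), ("2026-05", 0.25), ("2026-06", 0.40)])
```

If the prompt list changes in a quarterly refresh, compare rates only across the prompts that both runs share, otherwise the change reflects the new list rather than your visibility.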

What the audit usually reveals

The patterns that show up across teams running this for the first time are consistent. Citation rates are usually lower than expected. Description accuracy is patchier than expected. Cross-engine consistency is the lowest-scoring dimension for almost every brand. Most brands appear on one engine far more reliably than on the others, and their teams don’t realise it.

None of these are catastrophic findings. They’re starting points. The teams that take the audit seriously and act on the gaps tend to see meaningful movement within two or three quarters. The work isn’t in running it; it’s in coming back to it consistently and treating the trend lines as something to act on.