Quick Answer: To measure your AI search citation share of voice, compute it once per engine: cited answers divided by total runs, then your share of the competitive field, separately for ChatGPT, Claude, Gemini and Perplexity. A single blended score lies: Formative Digital's May 2026 scrape of 1,732 citations found cited sources barely overlap between engines.

Two flooring contractors operate eleven minutes apart in Brantford, Ontario. Ask the four big AI engines "who are the best flooring installers in Brantford" and the first shop is named by three of them; the second, with a tidier showroom and arguably better Google reviews, surfaces in exactly one. Neither owner could tell you this, because neither has measured it. Both assume AI search is going fine, and both are half blind. The gap between them is not luck. It is a measurable quantity called AI-citation share of voice, and almost nobody computes it correctly.

The measurement goes wrong because the engines do not agree with each other. When Formative Digital scraped 1,732 real AI citations across nine Ontario cities in May 2026, the headline was stark: across 583 distinct domains, only 16.3% were cited by two or more engines, leaving 83.7% unique to one. The second contractor is not invisible to AI in general; he is absent from three engines and present in a fourth, and a single blended "AI visibility score" would average that into mush. This page is the method we use to avoid that mistake, written so you can reproduce it. It sits inside Vector 11, Measure, of our 12 Vectors, and answers the question every competitor skips: not what share of voice is, but how to compute it per engine without lying to yourself.

Two pieces of prior work shape everything below. Kevin Indig's early-2026 Growth Memo established that about 44% of AI citations come from the first 30% of a page, which tells you what gets cited and therefore what you are measuring. The peer-reviewed GEO paper by Pranjal Aggarwal, Ameet Deshpande, Karthik Narasimhan and colleagues (arXiv:2311.09735) showed that visibility in generative answers is movable, liftable by up to 40% with targeted work, not a fixed property of your domain. If it moves, it has to be measured, honestly enough to tell whether your work moved it.

What exactly is AI-citation share of voice, and how is it different from a brand mention?

AI-citation share of voice is the percentage of relevant AI answers, inside a defined set of buyer questions, that cite your domain as a source, measured against the other businesses cited alongside you. It is a ratio, not a count. Run thirty prompts through one engine, get cited in nine of the answers, and your raw citation share for that engine is 30%. The competitive version divides your citations by the total across all named businesses, so it tells you not just whether you appear but how large your slice is relative to everyone fighting for the same answer space.

The distinction that trips people up is citation versus mention. A citation is a linked source: the answer points at your domain. A mention, sometimes called entity-based share of voice, is your brand name appearing in the prose without a link. The two diverge constantly. An engine can recommend "the team at Custom Contracting" by name while citing a directory page as its source, which counts as a mention for the contractor and a citation for the directory. Citation share tells you whether your own pages are doing the work; mention share tells you whether your reputation is doing it through someone else's page. Track both, separately, never blended.

Two share-of-voice metrics, kept separate

  • Citation share of voice: the proportion of answers that link your domain as a source. This is the metric you most directly control, because it depends on your own crawlable pages, and it is the leading indicator that your on-site GEO work is landing.
  • Mention (entity) share of voice: the proportion of answers that name your business, with or without a link. This captures reputation carried through directories and review sites, and can sit high even when your own site is never cited.
  • Why never to average them: a business at 5% citation and 40% mention share sits in a completely different position than one at 40% citation and 5% mention, yet a blended score treats them identically. The split is the insight.
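
The split between the two metrics can be made mechanical. Below is a minimal sketch in Python, assuming each captured answer has already been stored as its prose plus a list of cited domains. The `Answer` record, the `classify` helper, the plain substring match on the brand name, and the example domain `customcontracting.example` are all hypothetical illustrations, not part of any tool's API.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    """One captured AI answer (hypothetical record layout)."""
    text: str
    cited_domains: list[str] = field(default_factory=list)


def classify(answer: Answer, brand: str, domain: str) -> tuple[bool, bool]:
    """Return (cited, mentioned) for one answer, kept as two separate flags."""
    cited = domain.lower() in (d.lower() for d in answer.cited_domains)
    # Assumption: a case-insensitive substring match is enough to spot the brand.
    mentioned = brand.lower() in answer.text.lower()
    return cited, mentioned


# Invented sample answers for illustration only.
answers = [
    Answer("Try the team at Custom Contracting.", ["homestars.com"]),
    Answer("See this flooring guide.", ["customcontracting.example"]),
    Answer("No clear local pick.", []),
]
flags = [classify(a, "Custom Contracting", "customcontracting.example") for a in answers]
citation_share = sum(cited for cited, _ in flags) / len(flags)
mention_share = sum(mentioned for _, mentioned in flags) / len(flags)
```

The first sample answer is the directory case described earlier: a mention for the contractor, no citation. Reporting `citation_share` and `mention_share` as two values, never summed, keeps that distinction intact.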

Why must you measure each engine separately instead of one blended score?

You must measure each engine separately because the four engines read almost entirely different webs, so a blended score averages four genuine signals into one fictional one. This is the load-bearing finding from our own data, and the single reason most share-of-voice reporting is quietly wrong. In the Formative Digital scrape the grounding fingerprints barely overlapped: Gemini routed 384 of its citations through vertexaisearch.cloud.google.com, its own wrapper; ChatGPT reached for google.com 130 times, reading Maps and Knowledge Graph data; Claude leaned on the curated directory threebestrated.ca 116 times; Perplexity spread across homestars.com, opencare.com, furnaceprices.ca and bbb.org. Four engines, four source layers, one chat window each.
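
The overlap figure is a set computation you can rerun on your own captures. Here is a rough sketch in Python, assuming citations are stored as (engine, domain) pairs. The `engine_overlap` name is hypothetical, and the toy rows are invented to show the mechanics; they are not rows from the scrape.

```python
from collections import defaultdict

# Toy (engine, domain) pairs, invented for illustration.
citations = [
    ("gemini", "vertexaisearch.cloud.google.com"),
    ("chatgpt", "google.com"),
    ("claude", "threebestrated.ca"),
    ("perplexity", "homestars.com"),
    ("perplexity", "bbb.org"),
    ("claude", "homestars.com"),
]


def engine_overlap(rows):
    """Return (share of domains cited by 2+ engines, share unique to one engine)."""
    engines_by_domain = defaultdict(set)
    for engine, domain in rows:
        engines_by_domain[domain].add(engine)
    total = len(engines_by_domain)
    if total == 0:
        return 0.0, 0.0
    shared = sum(1 for engines in engines_by_domain.values() if len(engines) >= 2)
    return shared / total, (total - shared) / total


shared_share, unique_share = engine_overlap(citations)
```

In this toy input, one of five distinct domains appears in two engines, so the shared share is 0.2 and the unique share is 0.8.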

Independent research says the same from the retrieval side. Seer Interactive analysed more than 500 citations across roughly 100 queries and found over 87% of ChatGPT's SearchGPT citations matched Microsoft Bing's top organic results for the identical query, against only 56% for Google. Search Engine Land put it bluntly: Bing rank, not Google rank, predicts ChatGPT citations, so a business can dominate Google and stay invisible in ChatGPT. If the retrieval index differs per engine, citation share must be computed per engine, because you are measuring against different corpora. Blend them, and you cannot tell whether a 12% score means strong-in-two-and-absent-from-two or mediocre everywhere. Those demand opposite next moves.

A second, smaller reason also kills the blended number: the engines are probabilistic. Ask the same engine the same question twice and the list can reshuffle, so even within one engine a single reading is unreliable, which is why the method below samples each prompt several times. The blended score compounds both errors at once, cross-engine divergence and within-engine variance, into a figure that feels precise and means nothing. The full evidence behind the divergence sits in our study of why four AI engines name different local businesses for the same query; this page assumes that finding and builds on it.

Which prompts belong in your measurement set, and how many do you actually need?

The prompts that belong in your set are the real questions your customers ask an AI engine before they buy, phrased the way they actually phrase them, and for a single-city local business you usually need fifteen to thirty per engine. The number surprises people who expect hundreds. Local buyer intent clusters into a handful of recognisable shapes; once you cover the shapes, more near-duplicate prompts buy precision you lose again to run-to-run noise. Spend the effort on coverage and realism, not raw volume.

Three shapes cover most of the territory: ranked-list queries ("best HVAC contractors in Mississauga"), where directories and "best of" pages dominate; near-me and locality queries ("furnace repair near me in Hamilton"), which lean on Maps-grounded data; and problem-plus-service queries ("who can fix a leaking flat roof in Brantford"), which pull in firm sites that wrote about the specific problem. Represent each shape in proportion to how your customers actually search, define the geographic scope precisely, and then freeze the set. The frozen prompt set is the backbone of the method. Change the prompts and you have changed the ruler, so month-over-month movement stops meaning anything.
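
One way to keep the ruler honest is to fingerprint the frozen set and store that fingerprint with every measurement. The sketch below is one possible approach in Python, not a prescribed format: the shape labels, the example prompts, and the `fingerprint` helper are all assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical frozen set; labels and prompts are examples only.
PROMPT_SET = {
    "scope": "Brantford, Ontario",
    "prompts": [
        {"shape": "ranked_list", "text": "best flooring installers in Brantford"},
        {"shape": "near_me", "text": "flooring installer near me in Brantford"},
        {"shape": "problem_service", "text": "who can fix a squeaky hardwood floor in Brantford"},
    ],
}


def fingerprint(prompt_set: dict) -> str:
    """Stable SHA-256 of the set; any edit to scope or prompts changes it."""
    canonical = json.dumps(prompt_set, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


baseline = fingerprint(PROMPT_SET)
```

If a later cycle's fingerprint differs from `baseline`, the prompts changed and that month's numbers should not be compared against earlier ones.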

How did Formative Digital scrape 1,732 citations, and what is the per-engine share formula?

Formative Digital scraped the citations by running a frozen prompt set through DataForSEO's LLM endpoints against the live engines in May 2026, then parsing every cited domain out of the responses into a database. The job produced 176 successful queries, 1,732 individual citations, and 583 distinct domains. Every prompt was a real local-intent query of the form "who are the best {vertical} in {city}, Ontario," across nine Ontario cities, five verticals, and four engines. The point of an API rather than typing by hand is repeatability: firing the same prompt programmatically several times and capturing the citation list each time is the only way to compare across months without your typing introducing drift.

You do not need DataForSEO to reproduce this on one business. The manual version is the same logic by hand: paste each frozen prompt into each engine and record which businesses were named, which domains were cited, and where you sat. Tools that automate the capture layer (Otterly.ai, Profound, Semrush's AI Visibility Toolkit, Ahrefs Brand Radar, the HubSpot AEO Grader) save hours, but none do the judgement work: choosing prompts that match real intent, and reading four per-engine numbers as four truths instead of one dashboard gauge.
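
The parsing step, reducing each cited URL to a domain and counting, can be sketched with the Python standard library alone. This is a sketch under stated assumptions: it expects the cited URLs to have already been pulled out of a response as a plain list, and it does not reproduce DataForSEO's response format. The `cited_domains` helper and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def cited_domains(urls):
    """Normalise cited URLs to bare hosts, skipping anything unparseable."""
    domains = []
    for url in urls:
        host = urlparse(url).hostname
        if not host:
            continue  # no scheme or no host, so nothing to count
        host = host.lower()
        domains.append(host[4:] if host.startswith("www.") else host)
    return domains


# One hypothetical run; these URLs are illustrative, not scrape output.
run_urls = [
    "https://www.homestars.com/on/brantford/flooring",
    "https://homestars.com/companies/123",
    "https://www.bbb.org/ca/on/brantford",
    "not a url",
]
counts = Counter(cited_domains(run_urls))
```

Stripping the `www.` prefix before counting is a design choice: without it, the same site would show up as two distinct domains and inflate the domain count.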

The formula, per engine, is cited answers divided by total runs, then your share of the competitive field. Worked plainly: thirty prompts run three times each is ninety answers per engine. Cited in eighteen of those ninety, your raw citation share is 20%. For the competitive version, count citations across every named business, yours included: if that field collected ninety citations in total and you held eighteen of them, your competitive share of voice is 18 divided by 90, again 20% of the citation space. Compute that once per engine and you get four different numbers; the spread is the most useful thing on the page. Three runs is the practical floor and five is better, because these models are non-deterministic. A prompt fired once gives a coin-flip impression, not a measurement, and repeated-run averaging is what separates a real number from a screenshot somebody took once and over-trusted.
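
The two ratios can be written down as a short Python sketch. It assumes the competitive denominator includes your own citations, as the definition earlier on this page states. The function names, competitor labels, and per-competitor counts are invented for illustration.

```python
def raw_citation_share(cited_answers: int, total_answers: int) -> float:
    """Answers that cite you divided by all answers captured for one engine."""
    if total_answers <= 0:
        raise ValueError("total_answers must be positive")
    return cited_answers / total_answers


def competitive_share(citations_by_business: dict[str, int], you: str) -> float:
    """Your citations divided by the citations of all named businesses, you included."""
    total = sum(citations_by_business.values())
    return citations_by_business.get(you, 0) / total if total else 0.0


prompts, runs_per_prompt = 30, 3
total_answers = prompts * runs_per_prompt      # ninety answers per engine
raw = raw_citation_share(18, total_answers)    # eighteen cited answers out of ninety

# Hypothetical field for one engine; the counts sum to ninety citations.
field = {"you": 18, "competitor_a": 40, "competitor_b": 32}
comp = competitive_share(field, "you")
```

Run the same two functions once per engine, on that engine's answers only, and keep the results side by side instead of merging the inputs.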

Only after you have four honest per-engine numbers should you build an index, and even then you weight rather than average. A flat average treats Gemini's 5% of real audience the same as ChatGPT's dominant share, distorting the result toward engines that barely matter to your customers. Industry consensus, reflected in Semrush's share-of-voice methodology, computes share per response across a fixed prompt set, then weights by where audiences actually are. A defensible 2026 weighting for a North American local business looks roughly like the table below, a starting point you tune to your own audience data, not a law.

A worked per-engine share, then a weighted index

Engine | Suggested 2026 weight | Example citation share | Contribution to index
ChatGPT | 35% | 28% | 9.8
Google AI Overviews | 30% | 15% | 4.5
Perplexity | 20% | 22% | 4.4
Claude | 10% | 6% | 0.6
Gemini | 5% | 40% | 2.0
Weighted AI share-of-voice index | | | 21.3

Read the index as a convenience, never the answer. The example business scores a respectable 21.3 overall, yet the row that matters is Claude at 6%, a fixable weakness the index alone would bury. Keep the per-engine numbers visible underneath any single figure you report upward.
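
The arithmetic behind the table above can be checked with a few lines of Python. The weights and shares are copied from the table. The `weighted_index` helper is a hypothetical name, and normalising by the weight total is an added safeguard for when you tune weights that no longer sum to 100.

```python
# (weight %, example citation share %) per engine, taken from the table.
rows = {
    "ChatGPT": (35, 28),
    "Google AI Overviews": (30, 15),
    "Perplexity": (20, 22),
    "Claude": (10, 6),
    "Gemini": (5, 40),
}


def weighted_index(table):
    """Sum of weight x share, divided by the weight total."""
    total_weight = sum(weight for weight, _ in table.values())
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    return sum(weight * share for weight, share in table.values()) / total_weight


index = weighted_index(rows)
# The engine with the lowest share, which the single index would hide.
weakest = min(rows, key=lambda engine: rows[engine][1])
```

Here `index` comes out at 21.3 and `weakest` is Claude, which is the reason to report the weakest row alongside the index.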

How do you read the result, how often should you re-measure, and what do you log each cycle?

You read the result by comparing each per-engine share against the competitors that appear in that engine's answers, not against an absolute target, because what counts as strong is relative to how crowded your category is in that engine. In a thin vertical where only four or five businesses get cited, a 25% share can make you a co-leader; in a dense category in a large city, the same 25% might put you third. The first read is always who else holds that engine's answer space, and how much. A meaningful share is one climbing relative to your named competitors over consecutive cycles, on the engines your audience uses. Thresholds borrowed from another category mislead you.

Re-measure monthly by default, plus after any change that should move the needle: new reviews, a directory placement, a site restructure, a content refresh. Daily measurement mostly samples the probabilistic wobble and invites you to chase noise. A monthly cycle, prompt set frozen and each prompt sampled three to five times, separates signal from variance. Each cycle, log the same fields so the trend is real and not an artifact of changing your method: the date, the prompt set, the runs per prompt, the four citation shares, the four mention shares, the competitor set, and the domains cited most. Watch the source layer as a leading indicator, because it moves before the citation share does. Earn a threebestrated.ca placement this month and Claude shifts first, but only if you logged the layer, not just the headline percentage.
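
A fixed record shape makes it harder to drift between cycles. The sketch below shows one possible layout in Python, writing one row per engine per cycle to CSV. The `CycleLog` field names and the sample values are assumptions for illustration, not a required schema.

```python
import csv
import io
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class CycleLog:
    """One engine's reading for one measurement cycle (illustrative fields)."""
    measured_on: str
    prompt_set_id: str
    runs_per_prompt: int
    engine: str
    citation_share: float
    mention_share: float
    competitors: str
    top_cited_domains: str


def write_rows(rows, handle):
    """Write log rows as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=list(asdict(rows[0]).keys()))
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))


# Invented sample row; a real cycle would have one row per engine.
rows = [
    CycleLog(date(2026, 5, 1).isoformat(), "v1", 3, "claude", 0.06, 0.20,
             "shop_a;shop_b", "threebestrated.ca;homestars.com"),
]
buffer = io.StringIO()
write_rows(rows, buffer)
csv_text = buffer.getvalue()
```

Keeping the top cited domains in the same row as the shares is what lets you see the source layer move before the percentage does.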

One thing not to do: trust a rank-style proxy as a shortcut. Ahrefs analysed roughly 863,000 keyword SERPs and about four million AI Overview URLs and found only 38% of AI-Overview-cited pages also ranked in the organic top 10, while 31% did not rank in the top 100 at all. A rank tracker cannot stand in for AI citation share; it has to be measured directly, per engine, which is the whole reason this method exists.

One scope note before you start: this page is the repeating Vector 11 measurement loop, the monthly share-of-voice cadence you run for the life of the account, not the one-off baseline audit. That single snapshot is a separate job, the Vector 1 starting audit that maps where each engine surfaces you today, and it produces the first data point this loop then tracks month over month.

"The number everyone wants is a single AI visibility score, and that number does not exist. We measured four engines reading four different slices of the web, and 83.7 percent of what they cited lived in only one of them. So the honest deliverable is four numbers and a weighting you can defend, refreshed on a schedule, each read against the competitors who share that engine's answer space. A blended gauge feels like progress and tells you nothing. Four per-engine shares, sampled enough times to be real, tell you which lever to pull next. That is the whole of Vector 11. Truth, not tricks."

Matt Griffin, Founder, Formative Digital, Brantford, Ontario

One caveat belongs here, because owners spend real money acting on these numbers. A share-of-voice figure is a measurement, not a promise: outcomes depend on your industry, your competition, and your existing digital presence, and AI visibility does not move identically for every business. Our Brantford retail client Mattress Miracle grew from roughly 1,000 to more than 82,400 monthly organic visits (Semrush, April 2026) through sustained structured-content work, which reflects one industry and one starting point. What this method guarantees is not a result. It is an honest reading of where you stand, on each engine, that you can act on and re-check. For the detail behind the numbers, our breakdowns of the source mix Perplexity pulls from on local queries and how Gemini grounds local answers through Vertex show what you are really measuring inside each engine.

Frequently Asked Questions

What is share of voice in AI search?

AI search share of voice is the percentage of relevant AI answers that cite or recommend your business, measured against the competitors that appear alongside you. There are two versions: citation-based share of voice counts the answers that actually link to your domain, and entity-based or mention share of voice counts the answers that name your brand even without a link. The honest version is computed per engine, across a fixed prompt set, and averaged over several runs, because Formative Digital's May 2026 scrape found 83.7 percent of cited sources were unique to a single engine, so one blended number hides more than it shows.

How many prompts do I need to track to measure AI share of voice accurately?

For a single local service business in one city, a set of fifteen to thirty real buyer prompts per engine is usually enough to produce a stable share-of-voice figure, provided you run each prompt several times rather than once. The prompts should mirror how customers actually ask, mixing best-in-city queries, near-me phrasing, and service-plus-problem questions. Quality matters more than raw volume: a focused set of the questions that drive your category, sampled repeatedly, beats hundreds of prompts run a single time, because the run-to-run variance in these engines is large enough to swamp a one-shot reading.

How often should I measure my AI search share of voice?

Monthly is the right default cadence for most local businesses, with a fresh measurement after any major change such as a new batch of reviews, a directory placement, or a site restructure. AI answer engines update their grounding sources and models on their own schedules, so a daily check mostly captures run-to-run noise rather than real movement. A monthly cycle, with each prompt sampled several times and the same prompt set held constant, separates genuine change from the wobble and gives you a trend you can act on rather than react to.

Sources

  1. Seer Interactive. (2025). 87% of SearchGPT Citations Match Bing's Top Results. Analysis of 500+ citations across roughly 100 queries; 87%+ SearchGPT-Bing match versus 56% for Google.
  2. Search Engine Land. (2025). Bing, not Google, shapes which brands ChatGPT recommends. Bing rank predicts ChatGPT citations; 87% align with Bing's top results.
  3. Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K., & Deshpande, A. (2024). GEO: Generative Engine Optimization. Targeted optimisation lifts generative-answer visibility by up to 40%. arXiv:2311.09735.
  4. Ahrefs. (2025). Update: 38% of AI Overview Citations Pull From The Top 10. 863,000 keyword SERPs, ~4M AI Overview URLs; 38% of cited pages rank top 10, 31% rank outside top 100.
  5. Semrush. (2025). How to Measure AI Share of Voice. Share of voice as a function of brand mentions and position within each AI response, computed per response and averaged.

Get Your Free AI Visibility Audit

Formative Digital, Brantford, Ontario

We run your real customer prompts through ChatGPT, Claude, Gemini and Perplexity, sample each several times, and hand you four separate citation-share numbers with the competitor set and source layers behind each. You see which engine you are winning, which you are absent from, and what to fix first. You keep the report whether you work with us or not.
