Quick Answer: Retrieval-augmented generation (RAG) is the step where an AI engine fetches live web pages, then writes only from what it pulled. Whether your business is named is decided there, by retrieval, not by training data. In Formative Digital's scan of 1,732 AI citations across nine Ontario cities, 83.7% of sources were unique to one engine.
That last number is the whole article in one statistic. When four engines read four different slices of the web, being "good at AI" is not one task. It is the work of getting retrieved by each engine, then surviving the read. The table below shows how unevenly that plays out.
| Engine | Top source it routed through | Times cited | What it reveals |
|---|---|---|---|
| Google Gemini | vertexaisearch.cloud.google.com | 384 | A literal Vertex AI Search retrieval endpoint, grounding made visible |
| ChatGPT | google.com | 130 | Leans on Google Maps and the Knowledge Graph |
| Anthropic Claude | threebestrated.ca | 116 | Leans on curated local directories |
| Perplexity | homestars.com | 17 | Spreads across review directories and trade listings |
You are reading four different definitions of "the web" in one table. None of it is decided by what the models learned in training. It is decided at retrieval. Here is what that means for the question every owner asks: why am I named here and not there?
On this page
- What is retrieval-augmented generation, in plain English?
- Training data or live retrieval: which one decides if you are named?
- How does RAG grounding choose which sources to read?
- Why does Gemini cite a Vertex endpoint while ChatGPT cites google.com?
- Why do four engines name four different businesses for one question?
- Is being retrieved the same as ranking number one on Google?
- What signals make a local business retrievable, not just rankable?
- Where on the page does a grounded engine actually read?
- How do you get named once the engine has retrieved you?
- Frequently asked questions
What Is Retrieval-Augmented Generation, In Plain English?
Retrieval-augmented generation is when an AI engine searches the live web for a small set of relevant pages, then writes its answer using only what it just read. Two steps live inside that one phrase. Retrieval is the search: the engine fetches pages. Generation is the writing: the model phrases a reply. The "augmented" part means the model's own memory is being topped up with fresh pages before it speaks.
Amazon Web Services puts it precisely. RAG is "the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response." AWS adds that RAG "redirects the LLM to retrieve relevant information from authoritative, pre-determined knowledge sources" before it answers. Google Cloud describes the same architecture: a generative model connected to external knowledge so it grounds each response in documents retrieved at query time rather than relying only on what it learned in training.
For a business owner, the plain version is shorter. Ask a modern AI engine "who are the best roofers in Hamilton" and it does not recite a memorised list. It runs a search, pulls a handful of pages, and reads the names off them. If your business is on a page it pulled, you can be named. If not, you cannot. That single mechanic governs the rest of this article.
Training Data Or Live Retrieval: Which One Decides If You Are Named?
Live retrieval decides it, not training data. This is the point most agency content gets backwards. The training data shaped how the model writes and reasons. It did not memorise your local market in a way that controls today's answer. The businesses an engine names come from the pages it fetched the minute you asked.
The proof is in the timing. A roofing company that launched its website three months ago has nothing in any model's training corpus, which was frozen long before. Yet it can be named today, because retrieval pulls its directory listing live. The reverse holds too: a long-established firm that is invisible to retrieval (no clean listings, no structured data, no presence on the directories the engine pulls) goes unnamed no matter how famous it is locally. Memory is not the lever. The fetch is the lever.
Brad, Owner, Mattress Miracle, Brantford, ON: "In 40 years of advertising I've never seen anything like this. It's a completely new business."
Brad's reaction names the shift without the jargon. The rules that governed forty years of advertising (who has the biggest sign, the longest reputation, the most spent on print) do not decide what an AI engine reads. A fetch does. Results like Mattress Miracle's depend on industry, competition, and existing digital presence, and you can read the full account in our write-up of that Brantford retail engagement. The mechanism underneath it is retrieval, which is something you can earn.
How Does RAG Grounding Choose Which Sources To Read?
Grounding chooses sources by matching your question to candidate pages, ranking them, and pulling a small set the engine treats as trustworthy. The engine rewrites your question into one or more searches, scores candidate pages for how well they match both the words and the entity behind them, and retrieves the top few, often four to eight. The answer is then bound to that retrieved set: a grounded engine is built to name only businesses that appear in the pages it pulled.
What earns a page into that retrieved set is clarity, not cleverness. Three things weigh heavily. Entity clarity: the page names the business plainly and consistently, so the engine is confident which entity it is reading about. Structured data: LocalBusiness schema and Schema.org markup hand the engine machine-readable facts instead of asking it to guess from prose. Citation-ready structure: the page states answers in extractable sentences a synthesis step can lift cleanly. Directories the engine already trusts get pulled first, which is why a single strong listing can matter more than a redesign.
The query fan-out problem makes this sharper. One question often becomes several searches behind the scenes, and each search retrieves a slightly different set of pages. A clearly described business shows up across many of those fanned-out searches. A business with thin, inconsistent listings shows up in none. Retrieval is a clarity test run many times, not a popularity contest run once.
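The fan-out idea above can be sketched in a few lines of Python. This is a toy illustration, not how any engine actually ranks pages: the word-overlap score, the `k=2` cut-off, the three sample pages, and the three fanned-out queries are all assumptions made up for the example.

```python
from collections import Counter


def tokenize(text: str) -> set:
    """Lowercase a string and split it into a set of bare words."""
    return {w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")}


def score(query: str, page_text: str) -> float:
    """Share of query words found on the page (a stand-in for real ranking)."""
    q = tokenize(query)
    return len(q & tokenize(page_text)) / len(q) if q else 0.0


def retrieve(query: str, pages: dict, k: int = 2) -> list:
    """Return the k page names that best match one query."""
    ranked = sorted(pages, key=lambda name: score(query, pages[name]), reverse=True)
    return ranked[:k]


def fan_out_hits(queries: list, pages: dict, k: int = 2) -> Counter:
    """Count how many fanned-out searches retrieved each page."""
    hits = Counter()
    for q in queries:
        hits.update(retrieve(q, pages, k))
    return hits


# Hypothetical pages: one clear listing, one thin page, one directory round-up.
PAGES = {
    "clear-listing": "Example Roofing is a roofing contractor in Hamilton Ontario "
                     "offering roof repair and replacement",
    "thin-listing": "Welcome to our family business since 1998",
    "directory-page": "Best roofers in Hamilton rated roofing contractor list",
}

# Hypothetical rewrites of one user question.
QUERIES = [
    "best roofers in Hamilton",
    "roofing contractor Hamilton Ontario",
    "roof repair Hamilton",
]

hits = fan_out_hits(QUERIES, PAGES, k=2)
print(hits)
```

In this toy set the clearly described listing and the directory page are each retrieved by all three searches, while the thin page is never retrieved, which is the pattern the paragraph above describes.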
Why Does Gemini Cite A Vertex Endpoint While ChatGPT Cites google.com?
Because each engine runs its own retrieval plumbing, and that plumbing leaves a visible fingerprint in the citations. In Formative Digital's 1,732-citation scan, Gemini routed 384 of its citations through vertexaisearch.cloud.google.com. That is not a content site. It is a Vertex AI Search retrieval endpoint, Google's grounding service, showing up by name in the answer. You are looking at the retrieval step itself, exposed in the citation list.
ChatGPT left a different print. It leaned on google.com 130 times, pulling from Google Maps and the Knowledge Graph. Claude leaned on threebestrated.ca 116 times, trusting a curated Canadian directory layer. Perplexity spread thinner, across homestars.com, opencare.com, and bbb.org. Four engines, four habits, four different doors into the web.
The grounding fingerprint, engine by engine
- Gemini, 384 citations through Vertex AI Search: nearly everything routed through Google's grounding endpoint, the clearest proof that retrieval is a distinct, observable step.
- ChatGPT, 130 citations through google.com: heavy reliance on Maps and the Knowledge Graph, so your Google Business Profile and entity facts carry weight here.
- Claude, 116 citations through threebestrated.ca: a curated-directory habit, where being on the right vetted list matters more than raw site authority.
- Perplexity, top source homestars.com at 17: a spread across review directories and trade listings, less concentrated than the others.
The practical reading is simple. Optimising "for AI" as one target misses the point. The engines do not read the same web, so the work is plural by definition. We unpack the per-engine source habits further in our breakdown of the consensus gap between AI engines.
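If you collect the citation URLs each engine returns, finding its fingerprint is a simple tally. The sketch below is a minimal version of that count, assuming you already have the URLs as plain strings. The sample URLs and their paths are invented placeholders, not rows from the Formative Digital scan.

```python
from collections import Counter
from urllib.parse import urlparse


def domain(url: str) -> str:
    """Extract the host from a URL and drop a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def top_source(citations: list):
    """Return (domain, count) for the most-cited domain, or None if empty."""
    counts = Counter(domain(u) for u in citations)
    return counts.most_common(1)[0] if counts else None


# Hypothetical citation URLs per engine (paths are placeholders).
SAMPLE = {
    "gemini": [
        "https://vertexaisearch.cloud.google.com/a",
        "https://vertexaisearch.cloud.google.com/b",
        "https://www.gaf.ca/c",
    ],
    "chatgpt": [
        "https://www.google.com/maps/x",
        "https://google.com/y",
        "https://www.bbb.org/z",
    ],
}

top = {engine: top_source(urls) for engine, urls in SAMPLE.items()}
print(top)
```

Run against a real export, the same tally would surface whichever domain each engine leans on hardest.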
Why Do Four Engines Name Four Different Businesses For One Question?
Four engines name four different businesses because they retrieved four different sets of pages before they wrote a word. When the retrieved sources differ, the names lifted from them differ. The same question, asked the same day, produces shortlists that barely overlap. Three worked examples from the Formative Digital scan, run through DataForSEO across the four engines, make the abstraction concrete.
| Query | ChatGPT named | Claude named | Gemini named | Perplexity named |
|---|---|---|---|---|
| Best dentists, Toronto | The Richmond Dental Centre | Yorkville Smiles | Opencare listing | Opencare listing |
| Best roofers, Hamilton | Silva's Roofing & Siding | ThreeBestRated list | gaf.ca | HomeStars listing |
| Best HVAC, Mississauga | MH Heating and Cooling | UrbanTasker list | bbb.org | UrbanTasker list |
Look down the Hamilton roofers row. ChatGPT named an individual contractor it found through Google. Claude surfaced a ThreeBestRated round-up. Gemini grounded on a manufacturer's contractor finder. Perplexity led with a HomeStars page. Four engines, four front-runners, one query. If you only checked ChatGPT and saw a competitor named, you would have no idea Claude was reading a different page that might have named you, or no one. There is no single answer to "who does AI recommend," because there is no single AI.
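One way to put a number on "no single AI" is to measure how many cited sources appear in only one engine. The sketch below shows one plausible way to compute that share. It assumes the figure is the fraction of distinct sources cited by exactly one engine, which may not match the scan's exact method, and the sample sets are illustrative, not the scan data.

```python
from collections import Counter


def unique_share(sources_by_engine: dict) -> float:
    """Fraction of distinct sources that exactly one engine cited."""
    engines_per_source = Counter()
    for sources in sources_by_engine.values():
        # set() so a source counts once per engine, however often it was cited
        engines_per_source.update(set(sources))
    if not engines_per_source:
        return 0.0
    unique = sum(1 for n in engines_per_source.values() if n == 1)
    return unique / len(engines_per_source)


# Hypothetical per-engine source sets built from domains named in this article.
SAMPLE = {
    "chatgpt": {"google.com", "bbb.org"},
    "claude": {"threebestrated.ca", "homestars.com"},
    "gemini": {"vertexaisearch.cloud.google.com", "gaf.ca", "opencare.com"},
    "perplexity": {"homestars.com", "opencare.com", "bbb.org"},
}

share = unique_share(SAMPLE)
print(f"{share:.1%} of sources were cited by only one engine")
```

In this sample, four of the seven distinct sources are cited by a single engine. The higher that share climbs on real data, the less one engine's answer tells you about the others.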
Is Being Retrieved The Same As Ranking Number One On Google?
No. Being retrieved and ranking number one overlap, but they are not the same event, and treating them as identical is how good rankings still lose AI citations. Semrush studied 200,000 AI Overviews and found the overlap between cited links and the top 10 organic results was only around 20 to 26%. In their words, "there's no guarantee that ranking a URL in the top 10 organic results will mean the URL will also be in the AI Overview for that same keyword." Roughly three quarters of cited sources were not among the top-ten organic pages.
So ranking helps your odds of retrieval without settling them. A strong organic position makes your page a likely candidate. Retrieval then re-decides, weighting entity clarity, structured data, and trusted-directory presence in ways classic ranking does not. A page can rank first and still be passed over by an engine that pulled a directory listing instead. We map that gap in our guide to earning a place in Google's AI Overviews.
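You can run a rough version of this overlap check on your own keywords. The sketch below assumes you have two lists for one query: the URLs an AI answer cited and the organic results in rank order. It is a simple illustration of the idea, not Semrush's methodology, and the URLs are placeholders.

```python
def top10_overlap(cited: list, organic: list) -> float:
    """Share of cited URLs that also sit in the top 10 organic results."""
    if not cited:
        return 0.0
    top10 = set(organic[:10])
    return sum(1 for url in cited if url in top10) / len(cited)


# Hypothetical data for one query.
ORGANIC = [f"https://example.com/result-{i}" for i in range(1, 21)]
CITED = [
    "https://example.com/result-3",       # ranks in the top 10
    "https://example.com/result-15",      # ranks, but outside the top 10
    "https://directory.example/listing",  # not in the organic list at all
    "https://reviews.example/page",       # not in the organic list at all
]

overlap = top10_overlap(CITED, ORGANIC)
print(f"{overlap:.0%} of cited URLs were also top-10 organic results")
```

A low number here is the gap described above: pages being cited without ranking in the top ten.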
Academic work points the same direction. The Generative Engine Optimization paper by Pranjal Aggarwal, Ameet Deshpande and co-authors (arXiv:2311.09735) found that targeted methods can raise a source's visibility in generative answers by up to 40%. What gets retrieved and surfaced is optimisable, not fixed, which is good news for any business losing AI answers despite decent rankings.
What Signals Make A Local Business Retrievable, Not Just Rankable?
The signals that make a local business retrievable are entity clarity, consistency, structured data, and presence on the directories each engine pulls. Retrieval is a machine reading you, so the work is making it confident about who you are and where you operate. Four signals carry most of the weight.
What retrieval actually rewards
- Entity clarity. One unambiguous business identity the engine can resolve. A clean name, a defined category, a stable description. This is the Anchor vector, and it is the foundation the rest sits on. See our guide to anchoring your business entity in the Knowledge Graph.
- Consistency. The same name, address, and phone number everywhere the engine might read them. Conflicting listings make an engine uncertain which entity it is looking at, and uncertain entities get dropped from the retrieved set.
- LocalBusiness schema. Schema.org markup on your site that states your facts in a machine-readable form. The engine should not have to parse them out of paragraphs when you can hand them over directly.
- Trusted-directory presence. Listings on the platforms each engine actually pulls, such as HomeStars, ThreeBestRated, and Opencare. The Formative Digital data shows these directories are where Claude and Perplexity do much of their reading.
Note what is absent from that list: keyword density, backlink-volume tactics, and word count. Retrieval is not the old ranking game with a new name. It is a clarity test. The directories matter so much that we wrote a separate field guide on the local signals that decide who AI names for "near me" searches, because for many trades the directory listing is the page that gets pulled, not the business website at all.
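For the LocalBusiness schema signal, the markup itself is short. Below is a minimal sketch that builds Schema.org JSON-LD in Python and wraps it in the script tag a page would carry. The business name, address, phone number, and URL are invented placeholders, and the helper function is hypothetical. `RoofingContractor` is one of Schema.org's more specific LocalBusiness subtypes, so swap in the type that fits your trade.

```python
import json


def local_business_jsonld(name, street, city, region, postal_code,
                          phone, url, business_type="LocalBusiness"):
    """Build a Schema.org LocalBusiness description as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": business_type,
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
            "addressCountry": "CA",
        },
        "telephone": phone,
        "url": url,
        "areaServed": city,
    }


# Placeholder business details, not a real company.
markup = local_business_jsonld(
    name="Example Roofing",
    street="123 Example Street",
    city="Hamilton",
    region="ON",
    postal_code="L8P 0A0",
    phone="+1-905-555-0100",
    url="https://example.com",
    business_type="RoofingContractor",
)

# JSON-LD is embedded in the page inside a script tag of this type.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(markup, indent=2)
    + "\n</script>"
)
print(snippet)
```

Whatever you put in the markup should match your directory listings character for character, since the consistency signal is judged across all of them.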
Where On The Page Does A Grounded Engine Actually Read?
A grounded engine reads the opening of a page hardest, so the fact that names you has to sit early. Once retrieval has pulled your page, the engine still has to find the answer on it, and it does not weight every paragraph equally. Kevin Indig's early-2026 Growth Memo analysis, published through Search Engine Land and drawn from 18,012 verified citations out of roughly 30 million, found that 44.2% of citations come from the first 30% of content. Another 31.1% come from the middle, and 24.7% come from the final third.
Read that as an instruction. If the sentence that names your business or states your service area lives in a closing paragraph, you have buried it where fewer than a quarter of citations are drawn. The opening third is where retrieval pays out. State who you are, what you do, and where you do it near the top, in plain extractable sentences, before any storytelling.
Survive the read: the front-loading rule in practice
The Quick Answer at the top of this article is the front-loading rule applied to itself: the thesis and the key number sit in the first 50 words because that is the slice an engine reads hardest. For a local business page, the equivalent is a clear opening that names the business, defines the category, and states the city, all in extractable sentences, before any "since 1998 our family has" narrative. Retrieval pulls the page; the opening third decides whether the answer is found.
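A quick way to audit a page against the front-loading rule is to check how far down the naming fact first appears. The sketch below uses character offsets as a rough proxy for position. That is an assumption, since the cited study may measure position differently, and the 30% threshold simply mirrors the figure quoted above. The sample page text is made up.

```python
from typing import Optional


def first_mention_position(page_text: str, fact: str) -> Optional[float]:
    """Relative position (0.0 to 1.0) of the first mention, or None if absent."""
    if not page_text:
        return None
    idx = page_text.lower().find(fact.lower())
    return None if idx == -1 else idx / len(page_text)


def is_front_loaded(page_text: str, fact: str, threshold: float = 0.30) -> bool:
    """True when the fact first appears within the opening share of the page."""
    pos = first_mention_position(page_text, fact)
    return pos is not None and pos <= threshold


# Hypothetical page copy: one leads with the entity, one buries it.
good_page = (
    "Example Roofing is a roofing contractor in Hamilton, Ontario. "
    + "We repair and replace roofs. " * 5
)
buried_page = (
    "Since 1998 our family has served the community. " * 5
    + "Example Roofing is based in Hamilton."
)

print(is_front_loaded(good_page, "Example Roofing"))
print(is_front_loaded(buried_page, "Example Roofing"))
```

The first page passes because the business name opens the copy. The second fails because the name only appears after five sentences of narrative.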
How Do You Get Named Once The Engine Has Retrieved You?
You get named by earning retrieval first, then surviving the read, and the order matters. You cannot train your way into an answer, because training is not where the answer is decided. You earn retrieval through entity clarity, LocalBusiness schema, and citation-ready presence on the directories an engine pulls. Then you survive the read by putting the answer in the first 30% of the page. Two stages, both controllable, neither magic.
Formative Digital frames this as a sequence of vectors, and two of them carry the work. Vector 2, Anchor, fixes your entity so retrieval can resolve you with confidence. Vector 5, Cite, earns your presence on the third-party sources engines actually pull. The first-party data in this article is how we run Vector 1, Diagnose: we measure which engines retrieve a business today, on which sources, before recommending a single change. You can read how we diagnose AI visibility as a standalone method.
If you remember one thing, make it the order of operations. Retrieval before generation. Earn the fetch, then survive the read. Every time an AI engine names a business, it happens inside that loop, and both halves of it are work you can do.
Frequently Asked Questions
What is retrieval-augmented generation in simple terms?
Retrieval-augmented generation is when an AI engine searches the live web for a handful of pages related to your question, then writes its answer using only what it just read. The retrieval step pulls the sources; the generation step phrases the reply. It is the difference between an engine answering from memory and an engine answering from a quick read of the open web.
Does RAG use training data or live web sources?
RAG answers from live web sources fetched at the moment you ask. Amazon Web Services defines it as a process that references an authoritative knowledge base outside of the model's training data before generating a response. Training data shapes how the engine writes and reasons, but the businesses it names come from the pages it retrieves that minute, which is why a brand created last month can be cited today.
How does an AI engine decide which sources to retrieve?
The engine turns your question into a search, ranks candidate pages by how well they match the query and the entity, and pulls a small set, often four to eight. It favours pages where the business is named clearly, described with structured data, and listed on directories the engine already trusts. Each engine runs its own retrieval, so the shortlist differs by engine.
Why is my business named by one AI engine but not another?
Because each engine retrieves from a different slice of the web. In Formative Digital's May 2026 scan of 1,732 citations, 83.7% of cited sources appeared in only one of the four engines. Gemini routed 384 citations through a Vertex AI Search endpoint, ChatGPT leaned on google.com, and Claude leaned on threebestrated.ca. Different retrieval means different name lists.
Is RAG the same as ranking number one on Google?
No. Retrieval and classic ranking overlap, but they are not the same. Semrush studied 200,000 AI Overviews and found the overlap between cited links and the top 10 organic results was only about 20 to 26%. Ranking first helps your odds of being retrieved, but it does not guarantee an engine pulls and reads your page.
What signals make a local business retrievable, not just rankable?
Clear entity identity, consistent name, address, and phone across the web, LocalBusiness schema on your site, and presence on the directories an engine actually pulls, such as HomeStars, ThreeBestRated, and Opencare. Retrieval rewards machine-readable clarity about who you are and where you operate, not keyword stuffing.
How do I get my business retrieved and cited by AI?
Earn retrieval first, then survive the read. Make your entity unambiguous with structured data and consistent listings, get cited on the directories each engine trusts, and put the answer to the question high on the page. Kevin Indig's citation study found 44.2% of AI citations come from the first 30% of content, so the fact that names you must sit early, not in a closing paragraph.
Sources
- Amazon Web Services. What is RAG? Retrieval-Augmented Generation AI Explained. Amazon Web Services
- Google Cloud. What is Retrieval-Augmented Generation (RAG)? Google Cloud
- Semrush. We Studied 200,000 AI Overviews: Here's What We Learned. Overlap with top-10 organic results around 20 to 26%. Semrush
- Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K., & Deshpande, A. (2023). GEO: Generative Engine Optimization. arXiv preprint. arXiv:2311.09735
- Indig, K., Growth Memo (via Search Engine Land). (2026). 44% of ChatGPT citations come from the first third of content: Study. 18,012 verified citations. Search Engine Land
Get Your Free AI Visibility Audit
Formative Digital, Brantford, Ontario
The audit runs your business through ChatGPT, Claude, Gemini, and Perplexity, captures which engines retrieve you and on which sources, and reports where you are named and where you are missing. You get the report whether you engage further or not.