How Do Generative AI Responses Work?

Written on 30/1/2026 · Modified on 3/2/2026

Key points of the article

  • LLMs operate in two modes: knowledge base (training data) or real-time web search (RAG)
  • AI decides to search the web when it detects uncertainty or needs fresh data
  • Query fan-out breaks down each question into multiple sub-queries to explore the full semantic field
  • ChatGPT uses ~25 sources per response vs ~10 for Perplexity, with only 6% of sources in common
  • Sources come from the top 10-30 search results: SEO remains the foundation of AEO
  • AI favors well-structured, recent, and easily extractable content
  • Structured data helps LLMs understand the context of your content

In 2026, a growing number of internet users no longer click on links. They ask their questions directly to ChatGPT, Perplexity, or Gemini and expect a synthesized answer.

But how do these AIs build their responses? Where does the information they cite come from?

To optimize your visibility in AI engines (what we call AEO — Answer Engine Optimization), you first need to understand how they work. That's what we'll break down in this article.

The Two Response Modes of LLMs

When you ask an AI a question, it can respond in two very different ways. Understanding this distinction is fundamental.

Mode 1: Response from the Knowledge Base

In this mode, the AI responds solely from what it learned during its training. It doesn't search for information on the Internet.

How does it actually work?

The model was trained on billions of texts: articles, books, websites, forums... During this training, it learned statistical relationships between words and concepts. When you ask it a question, it generates a response by predicting the most likely words in that context.

The LLM doesn't "know" in the human sense. It has learned patterns and generates coherent text based on those patterns.
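
To make that concrete, here is a deliberately tiny sketch of next-token prediction in Python. The probability table is invented for illustration; a real model computes a distribution over tens of thousands of tokens using billions of learned parameters.

```python
import random

# Toy bigram table with invented probabilities, for illustration only.
# A real LLM computes a distribution over ~100k tokens at every step,
# using patterns learned from billions of training texts.
NEXT_TOKEN_PROBS = {
    ("the", "best"): {"SEO": 0.4, "strategy": 0.3, "way": 0.3},
    ("best", "SEO"): {"strategy": 0.7, "tool": 0.2, "audit": 0.1},
    ("SEO", "strategy"): {"for": 0.5, "is": 0.3, "requires": 0.2},
}

def generate(tokens: list[str], steps: int = 3) -> str:
    tokens = list(tokens)
    for _ in range(steps):
        dist = NEXT_TOKEN_PROBS.get(tuple(tokens[-2:]))  # last 2 tokens = context
        if dist is None:                                 # unseen context: stop
            break
        words, weights = zip(*dist.items())
        tokens.append(random.choices(words, weights)[0])  # sample the next token
    return " ".join(tokens)

print(generate(["the", "best"]))  # e.g. "the best SEO strategy for"
```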

The limitations of this mode:

  • Knowledge cutoff: knowledge stops at the training date
  • Hallucinations: if information is missing, the AI may invent a convincing but false answer
  • No verification: no external source to validate claims

Mode 2: Response with Web Search (RAG)

In this mode, the AI combines its internal knowledge with external sources retrieved in real-time from the web. This is called RAG (Retrieval-Augmented Generation).

This is the mode that matters for AEO. When the AI cites sources, it means it has activated web search.
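
Conceptually, a RAG pipeline looks like the sketch below. `search_web` and `llm_complete` are hypothetical placeholders standing in for a search API and an LLM endpoint, not real libraries:

```python
def search_web(query: str, k: int = 5) -> list[dict]:
    """Hypothetical search API returning {'url': ..., 'snippet': ...} results."""
    raise NotImplementedError

def llm_complete(prompt: str) -> str:
    """Hypothetical LLM call (placeholder, not a real API)."""
    raise NotImplementedError

def answer_with_rag(question: str) -> str:
    results = search_web(question)
    # Inject retrieved snippets into the prompt so the model can ground
    # its answer in fresh, citable sources instead of memory alone.
    context = "\n".join(f"[{i + 1}] {r['url']}: {r['snippet']}"
                        for i, r in enumerate(results))
    prompt = (
        "Answer the question using ONLY the sources below, "
        "citing them as [1], [2], ...\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return llm_complete(prompt)
```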

[Figure: ChatGPT web search]

When Does AI Decide to Search the Web?

The AI doesn't launch a web search for every prompt. It first evaluates how uncertain it is about the question being asked.

Signals that trigger a web search:

  • Queries that require fresh data (news, prices, recent events)
  • Questions that demand verifiable information (figures, statistics, precise facts)
  • Searches related to e-commerce (product comparisons, reviews, prices)
  • Topics where the model detects uncertainty in its knowledge

Conversely, for general knowledge questions or well-established concepts, AI often responds directly from its knowledge base.
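
You can picture this decision as a heuristic like the following sketch. In reality the model itself judges its uncertainty; the signal words and threshold here are invented purely for illustration.

```python
# Illustrative heuristic only: in reality the model itself judges its
# uncertainty. The signal lists and the threshold below are invented.
FRESHNESS_SIGNALS = {"2026", "price", "news", "latest", "today", "best"}
FACTUAL_SIGNALS = {"how many", "statistics", "compare", "review", " vs "}

def should_search(prompt: str) -> bool:
    text = prompt.lower()
    hits = sum(1 for w in FRESHNESS_SIGNALS | FACTUAL_SIGNALS if w in text)
    return hits >= 1  # any strong signal triggers a web search

print(should_search("Who wrote Les Misérables?"))              # False
print(should_search("Best e-commerce SEO strategy in 2026?"))  # True
```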

The Complete Process of a Web Search Response

When AI decides to search the web, it follows a multi-step process. This is where it gets interesting for understanding how to appear in its responses.

[Figure: diagram of how an AI generates a web-search response]

Step 1: Prompt Analysis

The AI analyzes your question to understand the search intent. It identifies:

  • The main topic
  • Implicit subtopics
  • The expected level of detail
  • Context (location, language, conversation history)

Step 2: Query Fan-Out (Decomposition into Sub-Queries)

This is the key step. The AI doesn't launch a single search matching your question. It breaks down your prompt into multiple sub-queries to explore different angles of the topic.

Concrete example:

You ask: "What's the best SEO strategy for an e-commerce site in 2026?"

The AI generates multiple sub-queries:

  • "e-commerce SEO strategy 2026"
  • "SEO trends 2026"
  • "product page SEO optimization"
  • "technical SEO e-commerce"
  • "link building e-commerce"
  • "Core Web Vitals SEO impact"

This mechanism is called query fan-out. It allows the AI to cover the entire semantic field of your question.
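
A fan-out step can be sketched as a single LLM call that asks for the decomposition. The prompt wording and the `llm_complete` placeholder below are assumptions, not any vendor's actual implementation:

```python
def llm_complete(prompt: str) -> str:
    """Hypothetical LLM call (placeholder, not a real API)."""
    raise NotImplementedError

def fan_out(user_prompt: str, max_queries: int = 6) -> list[str]:
    prompt = (
        f"Decompose this question into at most {max_queries} short search "
        "queries that together cover its full semantic field. "
        "One query per line, no numbering.\n\n"
        f"Question: {user_prompt}"
    )
    # For the e-commerce prompt above, this might come back as:
    # "e-commerce SEO strategy 2026", "SEO trends 2026", ...
    return [line.strip() for line in llm_complete(prompt).splitlines()
            if line.strip()]
```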

Key figures:

  • ChatGPT launches an average of 2.5 searches per prompt
  • ChatGPT uses an average of 25 sources per response
  • Perplexity (free version) launches 1 search and uses about 10 sources
  • ChatGPT sometimes formulates its queries in English, even for questions in other languages

Step 3: Search and Source Retrieval

The AI sends its sub-queries to search engines. Each LLM uses different sources:

  • ChatGPT: mix of Google + Bing
  • Perplexity: primarily Google
  • Gemini: Google Search
  • Copilot: Bing

Retrieved results generally come from the top 10 to top 30 positions of the SERPs. The AI doesn't go digging on page 5 of Google.

Important point for AEO: to appear in AI responses, you first need to be well-ranked in SEO for the sub-queries generated by query fan-out.
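
Putting steps 2 and 3 together, retrieval can be sketched as a loop over the sub-queries that keeps only the top of each SERP and deduplicates by URL. `search_web` is again a hypothetical placeholder:

```python
def search_web(query: str, k: int = 10) -> list[dict]:
    """Hypothetical search API returning {'url': ..., 'snippet': ...} results."""
    raise NotImplementedError

def retrieve_sources(sub_queries: list[str], per_query: int = 10) -> list[dict]:
    seen, sources = set(), []
    for query in sub_queries:
        for result in search_web(query, k=per_query):  # top of the SERP only
            if result["url"] not in seen:              # dedupe across sub-queries
                seen.add(result["url"])
                sources.append(result)
    return sources  # a few dozen unique sources; ~25 is typical for ChatGPT
```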

Step 4: Source Evaluation and Selection

The AI doesn't treat all sources equally. It evaluates:

  • Relevance to the question
  • Perceived reliability of the site (authority, E-E-A-T)
  • Content freshness (publication/update date)
  • Content structure (ease of information extraction)

Well-structured content with clear headings, direct answers, and factual data is prioritized. The AI needs to be able to easily extract information.
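
One way to picture this evaluation is as a weighted score over those four criteria. The features and weights below are invented to illustrate the idea; no LLM publishes its actual ranking function.

```python
from datetime import date

# Invented weights and features to illustrate the idea; no LLM
# publishes its actual source-ranking function.
def score_source(source: dict, today: date = date(2026, 2, 3)) -> float:
    relevance = source["relevance"]        # 0-1: similarity to the sub-query
    authority = source["authority"]        # 0-1: E-E-A-T / domain-trust proxy
    age_days = (today - source["published"]).days
    freshness = max(0.0, 1 - age_days / 730)  # decays to zero over ~2 years
    structure = source["extractability"]   # 0-1: headings, lists, direct answers
    return 0.4 * relevance + 0.25 * authority + 0.2 * freshness + 0.15 * structure

print(score_source({
    "relevance": 0.9, "authority": 0.7,
    "published": date(2025, 11, 1), "extractability": 0.8,
}))  # ≈ 0.83
```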

Step 5: Synthesis and Response Generation

The AI compiles information from different sources to generate a single, coherent response. It:

  • Cross-references information to identify consensus
  • Reformulates in a fluid, natural style
  • Adds citations to sources used (depending on the LLM)
  • Structures the response to be immediately usable
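
The synthesis step can be sketched as one final prompt that injects the ranked sources and asks for a consensus-driven, cited answer. As before, `llm_complete` is a hypothetical placeholder:

```python
def llm_complete(prompt: str) -> str:
    """Hypothetical LLM call (same placeholder as in the earlier sketches)."""
    raise NotImplementedError

def synthesize(question: str, sources: list[dict], top_n: int = 25) -> str:
    # Keep the best-scored sources (see the scoring sketch above).
    ranked = sorted(sources, key=lambda s: s["score"], reverse=True)[:top_n]
    numbered = "\n".join(f"[{i + 1}] {s['url']}\n{s['snippet']}"
                         for i, s in enumerate(ranked))
    prompt = (
        "Synthesize a single coherent answer from the sources below. "
        "Prefer facts on which several sources agree, and cite them "
        "inline as [1], [2], ...\n\n"
        f"{numbered}\n\nQuestion: {question}"
    )
    return llm_complete(prompt)
```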

Diagram: From Prompt to Response

  1. User Prompt: "What's the best SEO strategy for an e-commerce site in 2026?"
  2. Prompt Analysis: the AI identifies the intent (informational) and the context (e-commerce, SEO), then evaluates its uncertainty level to decide whether a web search is needed.
  3. Query Fan-Out: "e-commerce SEO strategy", "SEO trends 2026", "technical SEO e-commerce", "product page optimization", "link building e-commerce", "Core Web Vitals impact"
  4. Web Search: Google and Bing return the top 10-30 results per query (~25 sources in total).
  5. Source Evaluation: relevance, reliability (E-E-A-T), content freshness, and ease of information extraction.
  6. Synthesis & Aggregation: cross-referencing information, identifying consensus, and coherent reformulation.
  7. Generated Response: a structured response with citations to the sources used (source-1.com, source-2.com, source-3.com, +22 sources).

Differences Between LLMs

Not all LLMs work the same way. Here's what distinguishes the main ones.

ChatGPT (OpenAI)

ChatGPT offers two modes:

  • Without web search: responds from its knowledge base (knowledge cutoff)
  • With web search: activates browsing automatically or manually

When search is activated, ChatGPT is the most "source-hungry": 25 sources on average per response, with 2.5 searches per prompt.

Distinctive feature: ChatGPT often formulates part of its queries in English, even for questions in other languages.

[Figure: ChatGPT interface]

Perplexity

Perplexity is always connected to the web. That's its whole premise: it's an answer engine, not a chatbot.

It primarily uses Google as a source and systematically displays citations with links to sources. The free version launches about 1 search per prompt and uses about 10 sources.

Perplexity formulates queries very close to the original prompt, unlike ChatGPT which explores more broadly.

[Figure: Perplexity interface]

Google AI Overviews / AI Mode

Google AI Overviews (formerly SGE) and AI Mode are integrated directly into Google search results.

The advantage: they rely on the complete Google index. Cited sources are generally those appearing in the classic top 10 SERPs.

This is the AI surface where traditional SEO has the most direct impact on AI visibility.

[Figure: Google AI Mode / AI Overviews interface]

Gemini

Gemini (Google) works similarly to AI Overviews but in a conversational interface. It uses Google Search for its searches and can access real-time data.

[Figure: Gemini interface]

Claude (Anthropic)

Claude historically operated without web search in its standard version, responding from its knowledge base with a knowledge cutoff date.

Anthropic has since added an optional web search capability, and some third-party integrations layer their own retrieval on top, but Claude remains less search-centric than dedicated answer engines like Perplexity.

[Figure: Claude interface]

What This Means for Your Visibility

Now that we understand how it works, what does this concretely mean for appearing in AI responses?

SEO Remains the Foundation

To appear in AI sources, you first need to be well-ranked on Google and Bing. LLMs pull their sources from the top 10-30 SERPs.

No shortcuts: AEO starts with good SEO.

Cover the Complete Semantic Field

With query fan-out, AI doesn't search for your exact keyword. It explores the entire semantic field around the question.

Consequence: your content must cover a topic comprehensively, not just answer one specific query. This is where topic clusters and pillar content come into their own.

Make Information Extraction Easy

AI needs to be able to easily extract information from your content. For this:

  • Clear headings (H2, H3) that clearly announce the content
  • Direct answers to questions in the first paragraphs
  • Lists and tables for factual information
  • Short paragraphs (2-4 sentences max per idea)

AI-friendly content is above all human-friendly content. If a reader can scan your article and quickly find the information they're looking for, so can AI.
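
To see why structure matters, here is a sketch of the kind of extraction an AI pipeline can run on well-structured HTML, using the real BeautifulSoup library on an invented page fragment. Clear heading/answer pairs come out almost for free:

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Invented page fragment: each H2 asks a question, the first paragraph
# underneath answers it directly.
html = """
<h2>What is query fan-out?</h2>
<p>Query fan-out is the decomposition of a prompt into sub-queries.</p>
<h2>How many sources does ChatGPT use?</h2>
<p>About 25 sources per response, on average.</p>
"""

# Well-structured content maps cleanly to (question, direct answer) pairs,
# which is exactly what an answer engine needs to extract.
soup = BeautifulSoup(html, "html.parser")
for heading in soup.find_all(["h2", "h3"]):
    answer = heading.find_next("p")
    print(heading.get_text(strip=True), "->", answer.get_text(strip=True))
```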

Structured Data

Structured data (Schema.org) helps AI understand the nature and context of your content. It allows clear identification of:

  • Who the author is (credibility, E-E-A-T)
  • Which entity is described (company, product, person)
  • Which questions and answers the page contains (FAQPage)
  • When the content was published or last updated
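
As a minimal illustration (see the guide linked below for the full picture), here is what Article and FAQPage markup can look like, generated from Python with placeholder values:

```python
import json

# Minimal Schema.org sketch with placeholder values. Embed the output
# in a <script type="application/ld+json"> tag on the page.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How Do Generative AI Responses Work?",
    "author": {"@type": "Person", "name": "Thibaut Legrand"},
    "datePublished": "2026-01-30",
    "dateModified": "2026-02-03",
}
faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [{
        "@type": "Question",
        "name": "What is query fan-out?",
        "acceptedAnswer": {
            "@type": "Answer",
            "text": "The decomposition of a prompt into multiple sub-queries.",
        },
    }],
}
print(json.dumps(article, indent=2))
print(json.dumps(faq, indent=2))
```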

👉 We've written a complete guide on this topic: What Structured Data to Implement for SEO and AEO?

Content Freshness

AI favors recent, regularly updated content. A 2022 article is less likely to be cited than content updated in 2026.

Remember to update your key content and clearly indicate the last modification date.

Our Perspective at Vydera

At Vydera, we've been helping our clients with AEO for several months. Here's what we observe in the field.

AEO is not a revolution, it's an evolution.

The fundamentals remain the same: create quality content, well-structured, that truly answers users' questions. What changes is the importance of certain signals that were sometimes overlooked:

  • Complete semantic coverage of a topic (not just a keyword)
  • Editorial structure that facilitates extraction
  • Structured data that makes context explicit
  • Freshness and regular updates

What surprises us:

Only 6% of sources are shared between ChatGPT and Perplexity for the same prompt. This means that to be visible on both, you potentially need to double your efforts. Each LLM has its own selection criteria.

Our advice:

Don't start from scratch. Begin by analyzing how your brand appears (or doesn't) in AI responses for your key queries. Identify the sources cited by LLMs in your niche. Then optimize your existing content before creating anything new.

Conclusion

Generative AI doesn't "search" for an answer like a traditional search engine. It builds one by combining its internal knowledge and web sources retrieved in real-time.

The query fan-out mechanism is central: AI breaks down your question into sub-queries to explore the entire semantic field. To appear in its responses, you need to:

  1. Rank well in SEO for potential sub-queries
  2. Cover the topic comprehensively (topic clusters)
  3. Structure content to facilitate information extraction
  4. Implement structured data to make context explicit
  5. Maintain content freshness

AEO starts with good SEO. But it requires a more comprehensive approach, focused on semantic coverage and structural quality of content.

Want to know how your brand appears in AI responses? Or implement an AEO strategy tailored to your industry?

Contact us →

FAQ

What's the difference between a response with and without web search?

Without web search, the AI responds solely from its training data, with a fixed knowledge cutoff date and a risk of hallucination. With web search (RAG mode), it queries Google or Bing in real time to retrieve up-to-date sources and generate a more reliable, well-sourced response.

What is query fan-out?

Query fan-out is the mechanism by which the AI breaks your question down into multiple sub-queries before launching its searches. For example, a question about e-commerce SEO strategy generates searches on trends, technical SEO, link building, and so on. This lets the AI cover the whole topic comprehensively.

Why do ChatGPT and Perplexity cite different sources?

Each LLM uses different search engines (ChatGPT: Google + Bing; Perplexity: primarily Google), formulates its queries differently, and selects sources according to its own criteria. The result: only 6% of sources are shared between the two for the same prompt.

How do you get your content cited in AI responses?

First, rank well in SEO, because LLMs pull their sources from the top 10-30 Google/Bing results. Then structure your content for easy extraction (clear headings, direct answers, short paragraphs) and implement structured data to provide explicit context.

Is SEO still useful in the age of AI?

Yes, SEO remains fundamental for AEO. AI doesn't create its own sources: it retrieves content that already ranks well on search engines. Without SEO visibility on the sub-queries generated by query fan-out, your content won't be included in LLM responses.

Thibaut Legrand
Co-founder - Vydera