AI doesn't lie. It just doesn't know that it doesn't know. And that's much more dangerous.
What is hallucination, really?
AI hallucination happens when a large language model (LLM) confidently states something that isn't true. It doesn't throw an error or a warning. It simply invents an answer that is grammatically perfect, stylistically convincing and factually wrong.
Real-world examples:
- A lawyer used ChatGPT to draft a court filing → cited 6 case precedents that don't exist (Mata v. Avianca, 2023)
- A healthcare chatbot confidently recommended an incorrect drug dosage
- An enterprise AI assistant cited a company policy that never existed
Hallucination is not a bug — it's a natural consequence of how the model works. If you don't understand why it happens, you won't be able to manage it.
Why do models hallucinate?
Language models don't "know" — they predict
An LLM is not a knowledge database. It's a probabilistic text continuation engine: given a context, it generates a likely next token, one token at a time. If you ask "Who wrote War and Peace?", it doesn't look anything up. It reconstructs from statistical patterns what the most likely answer to such a question would be.
If the fact behind the question appeared often in training → the answer is usually correct. If it appeared rarely or never → the model fills the gap with plausible but false data.
The four main causes
Missing or conflicting training data
The model never encountered the question, or saw conflicting information (e.g. a book referenced with two different authors). It still answers, because that's its job.
Temperature too high
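The temperature parameter controls how much randomness goes into picking each token. Low values stick close to the most likely token; high values flatten the probability distribution, so unlikely tokens get picked more often. High value → more creative, but more prone to hallucination.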
Long context degradation
In very long contexts (100K+ tokens), the model does not use every part equally well. Information near the beginning and the end is recalled most reliably, while details buried in the middle are more often missed or misread: the "lost in the middle" effect.
Prompt ambiguity
If the question isn't clear, the model "picks" an interpretation. If it picks the wrong one → confident but irrelevant answer.
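To make the temperature cause concrete, here is a minimal sketch of how temperature reshapes the probabilities a model samples from. The function name and the three logits are made up for illustration; they are not output from a real model.

```typescript
// Toy illustration: turn raw scores (logits) into sampling probabilities.
// Dividing by the temperature before the softmax sharpens (T < 1) or
// flattens (T > 1) the resulting distribution.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    throw new Error("temperature must be > 0");
  }
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

// Hypothetical scores for three candidate tokens.
const exampleLogits = [4.0, 2.0, 1.0];
const cold = softmaxWithTemperature(exampleLogits, 0.1);
const hot = softmaxWithTemperature(exampleLogits, 1.5);

console.log("T=0.1:", cold.map((p) => p.toFixed(3)));
console.log("T=1.5:", hot.map((p) => p.toFixed(3)));
```

With these numbers, the top token takes essentially all of the probability at T = 0.1, but only about 71% at T = 1.5. The remaining share goes to the less likely candidates, which is where plausible but wrong continuations come from.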
The "illusion of knowledge"
The most dangerous part: the LLM doesn't know that it doesn't know. It has no internal confidence meter for factuality. So:
- ❌ It doesn't say "I don't know" on its own (unless trained to)
- ❌ It doesn't signal uncertainty
- ❌ It doesn't distinguish memorized facts from invented composites
This is a technical limit — not bad intent.
The 5 types of hallucination
Not every hallucination is the same, so neither is the mitigation.
| Type | What happens? | Example | How to handle |
|---|---|---|---|
| Factual | A specific fact is wrong | "Budapest is the capital of Poland" | RAG, validation |
| Source fabrication | Invented citation | Non-existent book / case | Mandatory source links |
| Logical | Wrong inference from correct data | Math error | Chain-of-thought, calculator tool |
| Instruction | Doesn't do what you asked | "Return JSON only" → starts with prose | Structured output, Zod / Pydantic |
| Context drift | Misquotes earlier conversation | "As you said, X..." (you didn't) | Shorter context, summary |
Practical mitigation techniques
RAG — Retrieval Augmented Generation
The most common and most effective mitigation: don't let the model "remember" — give it sources.
How it works:
User: "What does the 2024 leave policy say about home office?"
↓
Vector DB query (based on embedding of the question)
↓
Top-5 relevant document chunks returned
↓
Prompt: "Based on these documents, answer: [chunks] ... Question: ..."
↓
LLM answer with source citation
Code example (simplified):
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// vectorDb is a placeholder for your vector store client; its search() signature is illustrative
async function ragQuery(question: string) {
  // 1. Embed the question
  const queryEmbedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question
  });

  // 2. Find relevant chunks
  const relevantDocs = await vectorDb.search({
    embedding: queryEmbedding.data[0].embedding,
    topK: 5,
    minScore: 0.75 // low score → don't even answer
  });

  // 3. If nothing relevant → don't hallucinate
  if (relevantDocs.length === 0) {
    return "I couldn't find an answer in the documentation.";
  }

  // 4. Structured prompt with sources
  const context = relevantDocs
    .map((d, i) => `[Source ${i + 1}: ${d.source}]\n${d.content}`)
    .join("\n\n");

  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{
      role: "system",
      content: `Answer only based on the provided sources.
If the sources don't contain the answer, say: "I have no information on this".
After every claim, cite the source as [Source N].`
    }, {
      role: "user",
      content: `Sources:\n${context}\n\nQuestion: ${question}`
    }],
    temperature: 0.1
  });

  return response.choices[0].message.content;
}
Best practices:
- Minimum score threshold: if top-1 relevance is below 0.75, don't answer
- Mandatory source citation in the system prompt
- Chunk size: 200-500 tokens is a good starting point (not too short, not too long); tune it against your own retrieval results
- Hybrid search: vector + keyword combined (BM25 + cosine)
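The list above says to combine vector and keyword search but not how. One common way to merge the two result lists is reciprocal rank fusion (RRF), sketched below. This is an assumed technique chosen for illustration, not necessarily what your vector store does internally; the document ids are hypothetical, and k = 60 is a conventional default rather than a tuned value.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document,
// so documents ranked well by both searches rise to the top.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result ids from a keyword (BM25) search and a vector search.
const keywordHits = ["doc-leave-2024", "doc-remote-work", "doc-payroll"];
const vectorHits = ["doc-remote-work", "doc-onboarding", "doc-leave-2024"];

const fused = reciprocalRankFusion([keywordHits, vectorHits]);
console.log(fused.map((r) => r.id));
```

RRF only needs ranks, so it sidesteps the problem that BM25 scores and cosine similarities live on different scales. The minimum relevance threshold still has to be applied separately, because a fused rank says nothing about absolute relevance.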
Structured output — force the model into a shape
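When the model has to return a concrete structure, it has far less room to improvise: the format is enforced and free-text filler has nowhere to go. The field values themselves can still be wrong, so validate them.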
Example with Zod + OpenAI structured output:
import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const openai = new OpenAI();

const InvoiceSchema = z.object({
  invoiceNumber: z.string(),
  totalAmount: z.number(),
  currency: z.enum(["HUF", "EUR", "USD"]),
  items: z.array(z.object({
    description: z.string(),
    quantity: z.number(),
    unitPrice: z.number()
  })),
  // Critical: signal uncertainty
  confidence: z.enum(["high", "medium", "low"]),
  // Structured outputs require every field to be present, so use nullable() instead of optional()
  uncertainFields: z.array(z.string()).nullable()
});

// invoiceText holds the raw invoice text to extract from.
// In older (v4) versions of the SDK this helper is openai.beta.chat.completions.parse.
const response = await openai.chat.completions.parse({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "Extract the invoice data. Only what you can clearly see." },
    { role: "user", content: invoiceText }
  ],
  response_format: zodResponseFormat(InvoiceSchema, "invoice")
});
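The confidence and uncertainFields fields give the model an explicit place to report uncertainty. Self-reported confidence is a signal, not a calibrated measurement, so pair it with independent checks. If confidence === "low" → manual review.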
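A possible way to act on those fields is sketched below: one function collects every reason an extraction should go to a human. The type mirrors the fields named above; the function name, the tolerance, and the assumption that totalAmount equals the plain sum of the line items (no tax, no discounts) are illustrative choices, not part of the original schema.

```typescript
type ExtractedInvoice = {
  totalAmount: number;
  items: { description: string; quantity: number; unitPrice: number }[];
  confidence: "high" | "medium" | "low";
  uncertainFields?: string[] | null;
};

// Returns the reasons for manual review; an empty array means auto-accept.
function reviewReasons(invoice: ExtractedInvoice, tolerance = 0.01): string[] {
  const reasons: string[] = [];

  if (invoice.confidence === "low") {
    reasons.push("model reported low confidence");
  }

  const uncertain = invoice.uncertainFields ?? [];
  if (uncertain.length > 0) {
    reasons.push(`uncertain fields: ${uncertain.join(", ")}`);
  }

  // Business-logic check that does not depend on the model's self-report.
  // Assumes totalAmount is the plain sum of the line items.
  const itemsTotal = invoice.items.reduce(
    (sum, item) => sum + item.quantity * item.unitPrice,
    0
  );
  if (Math.abs(itemsTotal - invoice.totalAmount) > tolerance) {
    reasons.push(`line items sum to ${itemsTotal}, but totalAmount is ${invoice.totalAmount}`);
  }

  return reasons;
}

// The model claims high confidence, but the numbers do not add up.
const sampleInvoice: ExtractedInvoice = {
  totalAmount: 150,
  items: [
    { description: "Widget", quantity: 2, unitPrice: 50 },
    { description: "Shipping", quantity: 1, unitPrice: 30 },
  ],
  confidence: "high",
  uncertainFields: null,
};

const sampleReasons = reviewReasons(sampleInvoice);
console.log(sampleReasons);
```

The sample shows why the arithmetic check matters: the extraction is flagged even though the model reported high confidence and no uncertain fields.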
Chain-of-Thought and self-critique
Chain-of-Thought (CoT): ask the model to think step by step.
Bad prompt: "How many apples are left if you sell 3 of 10 twice?"
Good prompt: "Think step by step. 1) How many apples to start?
2) How many after the first sale? 3) How many after the second?"
Self-critique: 2-step generation:
// Step 1: answer
const answer = await generate(question);
// Step 2: self-critique
const critique = await generate(`
The following is the answer to the question:
Question: ${question}
Answer: ${answer}
Examine it critically:
1. Are there any unsupported claims?
2. Are there logical errors?
3. Are there fabricated facts or citations?
Return a corrected answer with only verified information.
`);
Self-critique can cut hallucination substantially (by 30-50% in favorable cases; the gain varies by task and model), at the cost of roughly 2x token spend.
Tool use — calculator, search, database
The model is bad at math, bad at dates, bad at real-time data. Give it tools.
const tools = [
  {
    type: "function",
    function: {
      name: "calculate",
      description: "Evaluate a math expression",
      parameters: { /* ... */ }
    }
  },
  {
    type: "function",
    function: {
      name: "search_database",
      description: "Database search for customer data",
      parameters: { /* ... */ }
    }
  },
  {
    type: "function",
    function: {
      name: "web_search",
      description: "Search for up-to-date information",
      parameters: { /* ... */ }
    }
  }
];
The model issues a tool_call → you execute it → result returns to the model. The data is real, the model only interprets.
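The "you execute it" step can be sketched as a small dispatcher. The ToolCall and ToolMessage types follow the message shapes of the OpenAI chat API; everything else is hypothetical, including the handler registry and the argument shape for calculate, which takes structured operands here (rather than a free-form expression) so that nothing needs to be passed to eval().

```typescript
type ToolCall = {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
};
type ToolMessage = { role: "tool"; tool_call_id: string; content: string };

// Hypothetical argument shape for the calculate tool.
type CalculateArgs = { left: number; operator: "+" | "-" | "*" | "/"; right: number };

function calculate({ left, operator, right }: CalculateArgs): number {
  // The arguments come from the model, so validate them at runtime.
  if (typeof left !== "number" || typeof right !== "number") {
    throw new Error("operands must be numbers");
  }
  switch (operator) {
    case "+": return left + right;
    case "-": return left - right;
    case "*": return left * right;
    case "/":
      if (right === 0) throw new Error("division by zero");
      return left / right;
    default:
      throw new Error(`unsupported operator: ${String(operator)}`);
  }
}

// A Map avoids accidental hits on inherited object properties such as "toString".
const handlers = new Map<string, (args: any) => unknown>([["calculate", calculate]]);

// Runs one tool call and packages the result (or the error) for the model.
function executeToolCall(call: ToolCall): ToolMessage {
  let content: string;
  try {
    const handler = handlers.get(call.function.name);
    if (!handler) throw new Error(`unknown tool: ${call.function.name}`);
    const args = JSON.parse(call.function.arguments);
    content = JSON.stringify({ result: handler(args) });
  } catch (error) {
    content = JSON.stringify({ error: (error as Error).message });
  }
  return { role: "tool", tool_call_id: call.id, content };
}

const toolReply = executeToolCall({
  id: "call_1",
  type: "function",
  function: { name: "calculate", arguments: '{"left":10,"operator":"-","right":6}' },
});
console.log(toolReply.content);
```

Errors are returned to the model as data instead of being thrown, so a hallucinated tool name or malformed arguments becomes something the model can read and correct on the next turn.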
Temperature and sampling
For factual tasks:
{
  temperature: 0.1,      // low randomness
  top_p: 0.95,           // nucleus sampling: sample only from the top 95% of probability mass
  presence_penalty: 0,
  frequency_penalty: 0
}
For creative tasks (marketing copy, brainstorm):
{
  temperature: 0.8,
  top_p: 0.95
}
Never use high temperature for factual answers.
Detection — how do you spot a hallucination?
Automated validation
Source check: if the model cites a source, automatically verify it exists:
type Source = { source: string; content: string };

async function validateCitations(answer: string, sources: Source[]) {
  const citationPattern = /\[Source (\d+)\]/g;
  const citations = [...answer.matchAll(citationPattern)];

  for (const match of citations) {
    const sourceIndex = parseInt(match[1], 10) - 1;
    // Reject both [Source 0] and numbers beyond the sources we actually supplied
    if (sourceIndex < 0 || sourceIndex >= sources.length) {
      throw new Error(`Hallucinated citation: ${match[0]}`);
    }
  }
}
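The check above only catches citations that point at a source that doesn't exist. A complementary check, sketched here as an assumption about how you might enforce the "cite after every claim" rule, flags sentences that carry no citation at all. The sentence splitting is deliberately naive (it breaks on decimals and abbreviations, and assumes the citation sits before the closing period).

```typescript
// Returns the sentences of an answer that contain no [Source N] marker.
function findUncitedSentences(answer: string): string[] {
  // Naive split: a run of non-terminator characters plus its trailing punctuation.
  const sentences = answer.match(/[^.!?]+[.!?]*/g) ?? [];
  return sentences
    .map((sentence) => sentence.trim())
    .filter((sentence) => sentence.length > 0 && !/\[Source \d+\]/.test(sentence));
}

const sampleAnswer =
  "Remote work is allowed two days a week [Source 1]. Requests need manager approval.";
const uncited = findUncitedSentences(sampleAnswer);
console.log(uncited);
```

An uncited sentence is not necessarily a hallucination, but it is an unverifiable claim, which makes it a reasonable trigger for a re-ask or a manual review.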
Schema validation: if you expect JSON, validate it:
try {
  // response is the raw model output as a string
  const parsed = InvoiceSchema.parse(JSON.parse(response));
} catch (e) {
  // Hallucinated / invalid structure
  retry(); // placeholder for your own bounded retry logic
}
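The retry step is worth bounding so that a model that keeps producing invalid output cannot loop forever. A minimal sketch, assuming a generate function that returns the raw model text and a validate function that throws on bad output (both names are hypothetical):

```typescript
// Calls generate(), validates the result, and gives up after maxAttempts.
async function generateValidated<T>(
  generate: () => Promise<string>,
  validate: (raw: string) => T, // must throw on invalid output
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return validate(await generate());
    } catch (error) {
      // Covers generation failures, invalid JSON, and schema mismatches alike.
      lastError = error;
    }
  }
  throw new Error(`No valid output after ${maxAttempts} attempts: ${String(lastError)}`);
}

// Usage with a fake generator that fails once, then returns valid JSON.
let fakeCalls = 0;
const fakeGenerate = async () => (++fakeCalls === 1 ? "Sure! Here is the JSON:" : '{"ok":true}');

generateValidated(fakeGenerate, (raw) => JSON.parse(raw) as { ok: boolean })
  .then((result) => console.log(result, `after ${fakeCalls} calls`));
```

In real use, validate would wrap something like InvoiceSchema.parse(JSON.parse(raw)), and the final error is the point where the request should fall back to a human or an "I don't know" response.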
Confidence measurement
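Logprobs: the logprobs parameter returns the log probability the model assigned to each token it produced. This measures how expected the wording was, not whether the claim is true, so treat it as a rough signal.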
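const response = await openai.chat.completions.create({
  // ...
  logprobs: true,
  top_logprobs: 5
});

// logprobs can be null, so fall back to an empty list
const tokens = response.choices[0].logprobs?.content ?? [];
const avgLogprob = tokens.length > 0
  ? tokens.reduce((sum, t) => sum + t.logprob, 0) / tokens.length
  : -Infinity; // no data → treat as low confidence

// -1.5 is a starting threshold; calibrate it on your own data
if (avgLogprob < -1.5) {
  // Low confidence → manual review or re-ask
}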
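To get a feel for what an average logprob means, it helps to convert it back to a probability: the exponential of the average logprob is the geometric mean of the per-token probabilities, so a threshold of -1.5 corresponds to roughly 0.22 per token. The sketch below also reports the weakest token, since an average can hide a single uncertain one. The function name and the token values are made up for illustration.

```typescript
type TokenLogprob = { token: string; logprob: number };

function summarizeLogprobs(tokens: TokenLogprob[]) {
  if (tokens.length === 0) throw new Error("no tokens to summarize");
  const avgLogprob = tokens.reduce((sum, t) => sum + t.logprob, 0) / tokens.length;
  const weakest = tokens.reduce((min, t) => (t.logprob < min.logprob ? t : min));
  return {
    avgLogprob,
    // exp(average logprob) = geometric mean of the per-token probabilities
    geometricMeanProbability: Math.exp(avgLogprob),
    weakestToken: weakest.token,
    weakestProbability: Math.exp(weakest.logprob),
  };
}

// Hypothetical logprobs: three confident tokens and one shaky one.
const logprobSummary = summarizeLogprobs([
  { token: "The", logprob: -0.1 },
  { token: " capital", logprob: -0.2 },
  { token: " is", logprob: -0.1 },
  { token: " Warsaw", logprob: -2.0 },
]);
console.log(logprobSummary);
```

Here the average is about -0.6, which passes a -1.5 threshold comfortably, yet the one token that carries the actual fact has a probability of only about 0.14. Checking the minimum alongside the average is a cheap way to catch that case.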
LLM-as-a-judge
Have another LLM (or the same one in a separate call) judge the answer:
Prompt: "For the question-answer pair below:
- Is the answer factually correct? (1-5)
- Does it contain unsupported claims? (yes/no)
- Are there contradictions? (yes/no)
Respond as JSON."
Not perfect (the judge can hallucinate too), but it catches a lot.
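Because the judge can hallucinate too, its reply deserves the same suspicion as any other model output. A minimal sketch of parsing and gating on the verdict follows. The JSON field names (factualScore, unsupportedClaims, contradictions), the use of booleans for the yes/no answers, and the pass threshold of 4 are all assumptions; the prompt above does not fix them, so you would need to name them in your own prompt.

```typescript
type JudgeVerdict = {
  factualScore: number; // 1-5
  unsupportedClaims: boolean;
  contradictions: boolean;
};

// Returns null when the judge's reply is not valid, well-formed JSON.
function parseJudgeVerdict(raw: string): JudgeVerdict | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const fields = data as Record<string, unknown>;
  const score = fields.factualScore;
  const unsupported = fields.unsupportedClaims;
  const contradictions = fields.contradictions;

  if (typeof score !== "number" || !Number.isInteger(score) || score < 1 || score > 5) return null;
  if (typeof unsupported !== "boolean" || typeof contradictions !== "boolean") return null;

  return { factualScore: score, unsupportedClaims: unsupported, contradictions };
}

function passesJudge(raw: string, minScore = 4): boolean {
  const verdict = parseJudgeVerdict(raw);
  // An unparseable verdict counts as a failure, never as a pass.
  if (!verdict) return false;
  return verdict.factualScore >= minScore && !verdict.unsupportedClaims && !verdict.contradictions;
}

console.log(passesJudge('{"factualScore":5,"unsupportedClaims":false,"contradictions":false}'));
console.log(passesJudge("Sure! Here is my evaluation..."));
```

Failing closed is the important design choice: when the judge's output is malformed, the answer goes to review rather than to the user.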
Production checklist
Before shipping the AI feature, check:
- Is RAG in place where factual answers are required?
- Is a minimum relevance score configured?
- Does the system prompt instruct the model to say "I don't know"?
- Is source citation mandatory?
- Is structured output used where structure matters?
- Is temperature low (0.0-0.3) for factual cases?
- Is tool use used where math, dates, real data matter?
- Is validation running on the output (schema, citation, business logic)?
- Monitoring: are low-confidence cases logged?
- Human-in-the-loop on critical decisions (medicine, law, finance)?
- Disclaimer: does the user know the answer came from an AI?
Business risk management
Beyond the technical mitigation, business decisions matter too:
Risk zones
| Use case | Risk | Strategy |
|---|---|---|
| Marketing copy generation | Low | LLM autonomous, human review before publish |
| Internal customer-info chatbot | Medium | RAG + source citation + "uncertain → handoff to human" |
| Legal / medical advice | High | Only with a human expert, never autonomous |
| Financial transaction decision | Critical | AI suggests, human decides, audit log |
The "90% accuracy" trap
If the AI is right 90% of the time, that can be excellent — or catastrophic. For a customer-service chatbot, 10% errors are tolerable. For a drug dosage suggestion, never.
The question is: what is the cost of an error?
- If low → you can grant autonomy
- If high → human-in-the-loop is mandatory
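The "cost of an error" question can be turned into back-of-the-envelope arithmetic: expected cost = error rate × volume × cost per error. The sketch below uses made-up numbers in arbitrary cost units, purely to show how the same 10% error rate leads to very different conclusions.

```typescript
// Expected monthly cost of errors, in whatever unit costPerError uses.
function expectedErrorCost(
  errorRate: number,
  requestsPerMonth: number,
  costPerError: number
): number {
  if (errorRate < 0 || errorRate > 1) {
    throw new Error("errorRate must be between 0 and 1");
  }
  return errorRate * requestsPerMonth * costPerError;
}

// Hypothetical figures: same accuracy, same volume, different stakes.
const chatbotCost = expectedErrorCost(0.1, 10_000, 2);      // a wrong answer costs a follow-up ticket
const dosageCost = expectedErrorCost(0.1, 10_000, 50_000);  // a wrong answer can harm a patient

console.log({ chatbotCost, dosageCost, ratio: dosageCost / chatbotCost });
```

With these inputs the chatbot's expected cost is about 2,000 units a month, the dosage assistant's about 50,000,000. The accuracy is identical; only the cost per error changed, and that is the number that decides how much autonomy the system can have.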
Summary: 7 takeaways
- Hallucination is not a bug — it's a natural consequence of the architecture. The model predicts, it doesn't know.
- 5 types: factual, source fabrication, logical, instruction, context drift. Each needs different mitigation.
- RAG is the most effective — give the model sources, don't let it remember. Minimum score, hybrid search, mandatory citation.
- Structured output — when forced into a shape, the model hallucinates less. Zod / Pydantic schema, confidence field.
- Tool use for math, dates, real data. Never let the model do arithmetic on its own.
- Temperature 0.0-0.3 for factual cases. Creativity and factuality pull in opposite directions.
- Human-in-the-loop for critical decisions. The 90% accuracy trap — the cost of the 10% decides.
Hallucination can't be fully eliminated — but it can be reduced to 1-2% with the right architecture. The difference between "an AI feature demo" and "an enterprise-ready AI system" is not the model, it's the validation layer built around it.
The model is a creative child. You are the responsible adult next to it.