AI hallucination mitigation — why it happens and how to handle it

AI doesn't lie. It just doesn't know that it doesn't know. And that's much more dangerous.

What is hallucination, really?

AI hallucination happens when a large language model (LLM) confidently states something that isn't true. It doesn't throw an error or a warning; it simply invents an answer that is grammatically perfect, stylistically convincing, and factually wrong.

Real-world examples:

A lawyer used ChatGPT to draft a court filing → cited 6 case precedents that don't exist (Mata v. Avianca, 2023)
A healthcare chatbot confidently recommended an incorrect drug dosage
An enterprise AI assistant cited a company policy that never existed

Hallucination is not a bug — it's a natural consequence of how the model works. If you don't understand why it happens, you won't be able to manage it.

Why do models hallucinate?

Language models don't "know" — they predict

An LLM is not a knowledge database. It's a probabilistic text continuation engine: given a context, it assigns a probability to every possible next token and samples a likely one, over and over. If you ask "Who wrote War and Peace?", it doesn't look at a list; it reconstructs from statistical patterns what the most likely answer to such a question would be.

If the fact being asked about appeared often in training → usually a correct answer. If it appeared rarely or never → the model fills the gap with plausible but false data.

The four main causes

Missing or conflicting training data: The model never encountered the question, or it saw conflicting information (e.g. a book attributed to two different authors). It still answers, because producing text is its job.

Temperature too high: The temperature parameter controls how far sampling may stray from the most likely token. A high value makes the output more creative but more prone to hallucination.

Long context degradation: The model does not attend evenly to a 100K+ token context. This is the "lost in the middle" effect: information at the beginning and end is usually recalled best, while details buried in the middle of long documents are often missed or misread.

Prompt ambiguity: If the question isn't clear, the model "picks" an interpretation. If it picks the wrong one → confident but irrelevant answer.
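To make the temperature cause concrete, here is a minimal sketch of how temperature reshapes a next-token distribution. The candidate tokens and logits are invented for illustration; a real model scores tens of thousands of tokens, and the function name is ours, not part of any SDK.

```typescript
// Hypothetical logits for the token after "War and Peace was written by".
// The numbers are made up purely to show the effect.
const candidates = ["Tolstoy", "Dostoevsky", "Chekhov"];
const logits = [4.0, 2.0, 1.0];

// Softmax with temperature: divide the logits by T, then normalize to probabilities.
function softmaxWithTemperature(values: number[], temperature: number): number[] {
  if (temperature <= 0) {
    throw new Error("temperature must be > 0");
  }
  const scaled = values.map((v) => v / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((v) => Math.exp(v - max));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

const cold = softmaxWithTemperature(logits, 0.2);
const hot = softmaxWithTemperature(logits, 1.5);

candidates.forEach((token, i) => {
  console.log(token, "T=0.2:", cold[i].toFixed(3), "T=1.5:", hot[i].toFixed(3));
});
```

With these toy numbers, T = 0.2 puts almost all of the probability on the top token, while at T = 1.5 the top token keeps only about 71% and the rest goes to the wrong authors.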

The "illusion of knowledge"

The most dangerous part: the LLM doesn't know that it doesn't know. It has no internal confidence meter for factuality. So:

❌ It doesn't say "I don't know" on its own (unless trained to)
❌ It doesn't signal uncertainty
❌ It doesn't distinguish memorized facts from invented composites

This is a technical limit — not bad intent.

The 5 types of hallucination

Not all hallucinations are alike, so neither are the mitigations.

Type	What happens?	Example	How to handle
Factual	A specific fact is wrong	"Budapest is the capital of Poland"	RAG, validation
Source fabrication	Invented citation	Non-existent book / case	Mandatory source links
Logical	Wrong inference from correct data	Math error	Chain-of-thought, calculator tool
Instruction	Doesn't do what you asked	"Return JSON only" → starts with prose	Structured output, Zod / Pydantic
Context drift	Misquotes earlier conversation	"As you said, X..." (you didn't)	Shorter context, summary

Practical mitigation techniques

RAG — Retrieval Augmented Generation

The most common and most effective mitigation: don't let the model "remember" — give it sources.

How it works:

User: "What does the 2024 leave policy say about home office?"
   ↓
Vector DB query (based on embedding of the question)
   ↓
Top-5 relevant document chunks returned
   ↓
Prompt: "Based on these documents, answer: [chunks] ... Question: ..."
   ↓
LLM answer with source citation

Code example (simplified):

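import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
// vectorDb: placeholder for your vector store client; its search() API is assumed here

async function ragQuery(question: string) {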
  // 1. Embed the question
  const queryEmbedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question
  });

  // 2. Find relevant chunks
  const relevantDocs = await vectorDb.search({
    embedding: queryEmbedding.data[0].embedding,
    topK: 5,
    minScore: 0.75 // low score → don't even answer
  });

  // 3. If nothing relevant → don't hallucinate
  if (relevantDocs.length === 0) {
    return "I couldn't find an answer in the documentation.";
  }

  // 4. Structured prompt with sources
  const context = relevantDocs
    .map((d, i) => `[Source ${i+1}: ${d.source}]\n${d.content}`)
    .join("\n\n");

  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{
      role: "system",
      content: `Answer only based on the provided sources.
        If the sources don't contain the answer, say: "I have no information on this".
        After every claim, cite the source as [Source N].`
    }, {
      role: "user",
      content: `Sources:\n${context}\n\nQuestion: ${question}`
    }],
    temperature: 0.1
  });

  return response.choices[0].message.content;
}
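The vectorDb.search call above is a placeholder, so here is a rough in-memory sketch of what such a search typically does: score every chunk by cosine similarity, drop anything under the minimum score, and return the best few. The Chunk type, the searchChunks function and the toy 3-dimensional embeddings are all assumptions made for the example; a real vector store will have its own API.

```typescript
// Hypothetical in-memory stand-in for vectorDb.search; real vector stores differ.
interface Chunk {
  source: string;
  content: string;
  embedding: number[];
}

interface ScoredChunk extends Chunk {
  score: number;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk, drop those below minScore, return the best topK.
function searchChunks(query: number[], chunks: Chunk[], topK: number, minScore: number): ScoredChunk[] {
  return chunks
    .map((c) => ({ ...c, score: cosineSimilarity(query, c.embedding) }))
    .filter((c) => c.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 3-dimensional embeddings, invented for the example.
const policyChunks: Chunk[] = [
  { source: "leave-policy.md", content: "Home office days count as working days.", embedding: [0.9, 0.1, 0.0] },
  { source: "travel-policy.md", content: "Flights must be booked 14 days ahead.", embedding: [0.0, 0.2, 0.9] },
];

const hits = searchChunks([1, 0, 0], policyChunks, 5, 0.75);
console.log(hits.map((h) => `${h.source} (${h.score.toFixed(2)})`));

// A query that matches nothing well returns an empty list,
// which is the signal for the caller to refuse to answer.
const noHits = searchChunks([0, 1, 0], policyChunks, 5, 0.75);
console.log(noHits.length);
```

The empty result is the important case: it is what lets step 3 of ragQuery return "I couldn't find an answer" instead of handing the model irrelevant context.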

Best practices:

Minimum score threshold: if top-1 relevance is below the threshold (0.75 in the example; calibrate it for your embedding model and data), don't answer
Mandatory source citation in the system prompt
Chunk size: 200-500 tokens is a sensible starting range (not too short, not too long); tune it on your own documents
Hybrid search: vector + keyword combined (BM25 + cosine)
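The list above doesn't say how the keyword and vector results should be combined. One common option, shown here as a sketch rather than the method this article prescribes, is reciprocal rank fusion (RRF): it merges the two rankings by position, so BM25 scores and cosine scores never have to be put on the same scale. The document IDs are made up, and k = 60 is a conventional default rather than a tuned value.

```typescript
// Reciprocal rank fusion: score(doc) = sum over rankings of 1 / (k + rank),
// where rank starts at 1 for the best result in each list.
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores: { [docId: string]: number } = {};
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      scores[docId] = (scores[docId] || 0) + 1 / (k + index + 1);
    });
  }
  // Highest fused score first.
  return Object.keys(scores).sort((a, b) => scores[b] - scores[a]);
}

// Hypothetical result lists: one from BM25 keyword search, one from cosine similarity.
const bm25Ranking = ["doc-a", "doc-b", "doc-c"];
const vectorRanking = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([bm25Ranking, vectorRanking]);
console.log(fused);
```

Documents that both retrievers rank highly (doc-b and doc-a here) rise to the top, while documents found by only one retriever stay in the list but lower down.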

Structured output — force the model into a shape

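When the model has to return a concrete structure, it has far less room to drift: format and instruction errors largely disappear. The values inside the structure can still be wrong, so structure complements validation rather than replacing it.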

Example with Zod + OpenAI structured output:

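import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment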

const InvoiceSchema = z.object({
  invoiceNumber: z.string(),
  totalAmount: z.number(),
  currency: z.enum(["HUF", "EUR", "USD"]),
  items: z.array(z.object({
    description: z.string(),
    quantity: z.number(),
    unitPrice: z.number()
  })),
  // Critical: signal uncertainty
  confidence: z.enum(["high", "medium", "low"]),
  // Structured outputs expect every field to be present, so use nullable instead of optional
  uncertainFields: z.array(z.string()).nullable()
});

// Note: in older SDK versions this helper lives under openai.beta.chat.completions.parse
const response = await openai.chat.completions.parse({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "Extract the invoice data. Only what you can clearly see." },
    { role: "user", content: invoiceText }
  ],
  response_format: zodResponseFormat(InvoiceSchema, "invoice")
});

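The confidence and uncertainFields fields give the model an explicit place to flag uncertainty. The parsed object is available as response.choices[0].message.parsed. If confidence === "low" → manual review. Treat the value as the model's self-report, not as a calibrated probability.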
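What happens after parsing matters as much as the schema. The sketch below shows one possible routing rule for an extracted invoice: send it to manual review if the model reports low confidence, flags any field, or the line items don't add up to the total. The ExtractedInvoice type mirrors the schema fields above; routeInvoice, the route names and the tolerance are hypothetical, and the arithmetic check assumes totalAmount is the plain sum of the line items, which invoices with tax or discounts won't satisfy.

```typescript
// Plain TypeScript mirror of the InvoiceSchema fields used for routing.
interface InvoiceItem {
  description: string;
  quantity: number;
  unitPrice: number;
}

interface ExtractedInvoice {
  invoiceNumber: string;
  totalAmount: number;
  items: InvoiceItem[];
  confidence: "high" | "medium" | "low";
  uncertainFields?: string[] | null;
}

type Route = "auto" | "manual_review";

// Hypothetical rule: self-reported confidence alone is not trusted;
// a cheap arithmetic cross-check can overrule it.
function routeInvoice(invoice: ExtractedInvoice, tolerance: number = 0.01): Route {
  const flagged = invoice.uncertainFields ? invoice.uncertainFields.length : 0;
  const itemsTotal = invoice.items.reduce((sum, item) => sum + item.quantity * item.unitPrice, 0);
  const totalsMatch = Math.abs(itemsTotal - invoice.totalAmount) <= tolerance;

  if (invoice.confidence === "low" || flagged > 0 || !totalsMatch) {
    return "manual_review";
  }
  return "auto";
}

// Invented sample data: the items sum to 5000.
const consistentInvoice: ExtractedInvoice = {
  invoiceNumber: "INV-001",
  totalAmount: 5000,
  items: [
    { description: "Consulting", quantity: 2, unitPrice: 1500 },
    { description: "Setup fee", quantity: 1, unitPrice: 2000 },
  ],
  confidence: "high",
  uncertainFields: null,
};

// Same invoice, but the extracted total does not match the items.
const mismatchedInvoice: ExtractedInvoice = { ...consistentInvoice, totalAmount: 5500 };

console.log(routeInvoice(consistentInvoice));
console.log(routeInvoice(mismatchedInvoice));
```

The second invoice goes to review even though the model claimed high confidence, which is the point of pairing the self-report with an independent check.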

Chain-of-Thought and self-critique

Chain-of-Thought (CoT): ask the model to think step by step.

Bad prompt: "How many apples are left if you sell 3 of 10 twice?"
Good prompt: "Think step by step. 1) How many apples to start?
              2) How many after the first sale? 3) How many after the second?"

Self-critique is a two-step generation:

// Step 1: answer
const answer = await generate(question);

// Step 2: self-critique
const critique = await generate(`
  The following is the answer to the question:
  Question: ${question}
  Answer: ${answer}

  Examine it critically:
  1. Are there any unsupported claims?
  2. Are there logical errors?
  3. Are there fabricated facts or citations?

  Return a corrected answer with only verified information.
`);

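Self-critique can cut hallucination by roughly 30-50% on some tasks, though the gain varies by model and domain and is worth measuring on your own data. The cost is about 2x token spend.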

Tool use — calculator, search, database

The model is bad at math, bad at dates, bad at real-time data. Give it tools.

const tools = [
  {
    type: "function",
    function: {
      name: "calculate",
      description: "Evaluate a math expression",
      parameters: { /* ... */ }
    }
  },
  {
    type: "function",
    function: {
      name: "search_database",
      description: "Database search for customer data",
      parameters: { /* ... */ }
    }
  },
  {
    type: "function",
    function: {
      name: "web_search",
      description: "Search for up-to-date information",
      parameters: { /* ... */ }
    }
  }
];

The model issues a tool_call → you execute it → result returns to the model. The data is real, the model only interprets.
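The "you execute it" step can be sketched as a small dispatcher. The ToolCall shape below is a simplification of what the API returns, and the expression argument name is an assumption, since the parameter schemas above are left elided. The calculator is a tiny hand-written parser for + - * / and parentheses, so model-written text never reaches eval().

```typescript
// Simplified shape of a tool call; the real API object carries more fields.
interface ToolCall {
  toolName: string;
  argumentsJson: string; // JSON-encoded arguments, as produced by the model
}

// Recursive-descent evaluator for + - * / and parentheses.
function calculate(expression: string): number {
  const src = expression.replace(/\s+/g, "");
  let pos = 0;

  // expression := term (("+" | "-") term)*
  function parseExpression(): number {
    let value = parseTerm();
    while (src.charAt(pos) === "+" || src.charAt(pos) === "-") {
      const op = src.charAt(pos);
      pos++;
      const right = parseTerm();
      value = op === "+" ? value + right : value - right;
    }
    return value;
  }

  // term := factor (("*" | "/") factor)*
  function parseTerm(): number {
    let value = parseFactor();
    while (src.charAt(pos) === "*" || src.charAt(pos) === "/") {
      const op = src.charAt(pos);
      pos++;
      const right = parseFactor();
      value = op === "*" ? value * right : value / right;
    }
    return value;
  }

  // factor := "-" factor | "(" expression ")" | number
  function parseFactor(): number {
    if (src.charAt(pos) === "-") {
      pos++;
      return -parseFactor();
    }
    if (src.charAt(pos) === "(") {
      pos++;
      const inner = parseExpression();
      if (src.charAt(pos) !== ")") throw new Error("expected ')'");
      pos++;
      return inner;
    }
    const match = /^\d+(\.\d+)?/.exec(src.slice(pos));
    if (!match) throw new Error(`unexpected input at position ${pos}`);
    pos += match[0].length;
    return parseFloat(match[0]);
  }

  const total = parseExpression();
  if (pos !== src.length) throw new Error(`unexpected input at position ${pos}`);
  return total;
}

// Execute a tool call and return the string that is sent back to the model.
function executeToolCall(call: ToolCall): string {
  const args = JSON.parse(call.argumentsJson);
  if (call.toolName === "calculate") {
    return String(calculate(String(args.expression)));
  }
  // Unknown tools are reported back instead of being guessed at.
  return `Error: unknown tool "${call.toolName}"`;
}

const toolResult = executeToolCall({
  toolName: "calculate",
  argumentsJson: JSON.stringify({ expression: "10 - 3 - 3" }),
});
console.log(toolResult);
```

The model only has to decide that a calculation is needed and write the expression; the number itself comes from deterministic code.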

Temperature and sampling

For factual tasks:

{
  temperature: 0.1,    // low creativity
  top_p: 0.95,         // nucleus sampling: sample only from the top 95% of probability mass
  presence_penalty: 0,
  frequency_penalty: 0
}

For creative tasks (marketing copy, brainstorm):

{
  temperature: 0.8,
  top_p: 0.95
}

Never use high temperature for factual answers.

Detection — how do you spot a hallucination?

Automated validation

Source check: if the model cites a source, automatically verify it exists:

async function validateCitations(answer: string, sources: Source[]) {
  const citationPattern = /\[Source (\d+)\]/g;
  const citations = [...answer.matchAll(citationPattern)];

  for (const match of citations) {
    const sourceIndex = parseInt(match[1], 10) - 1;
    // Reject [Source 0] as well as numbers past the end of the list
    if (sourceIndex < 0 || sourceIndex >= sources.length) {
      throw new Error(`Hallucinated citation: ${match[0]}`);
    }
  }
}
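An index check like the one above proves only that the cited source number exists, not that the source supports the claim. As a rough complement, the sketch below measures word overlap between each claim and the source it cites. This is a crude lexical heuristic of our own, not an established method: the function names, the 4-character word filter and the 0.5 threshold are all assumptions, and an honest paraphrase will score low, so treat a hit as "needs review" rather than "hallucinated".

```typescript
interface SourceDoc {
  source: string;
  content: string;
}

// Lowercased words of 4+ characters (a crude way to skip most stop words).
function contentWords(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]{4,}/g) || [];
}

// Fraction of the claim's content words that also occur in the source text.
function supportScore(claim: string, sourceText: string): number {
  const words = contentWords(claim);
  if (words.length === 0) return 0;
  const sourceWords = contentWords(sourceText);
  const found = words.filter((w) => sourceWords.indexOf(w) !== -1);
  return found.length / words.length;
}

// Treat the text before each [Source N] marker as the claim it supports, and
// return the claims whose source is missing or shares too few words with them.
function findWeakCitations(answer: string, sources: SourceDoc[], minSupport: number = 0.5): string[] {
  const pattern = /\[Source (\d+)\]/g;
  const weak: string[] = [];
  let claimStart = 0;
  let previousClaim = "";
  let m: RegExpExecArray | null = pattern.exec(answer);
  while (m !== null) {
    const text = answer.slice(claimStart, m.index).trim();
    // Back-to-back markers such as [Source 1][Source 2] share the same claim.
    const claim = text.length > 0 ? text : previousClaim;
    previousClaim = claim;
    claimStart = m.index + m[0].length;
    const cited: SourceDoc | undefined = sources[parseInt(m[1], 10) - 1];
    if (cited === undefined || supportScore(claim, cited.content) < minSupport) {
      weak.push(claim);
    }
    m = pattern.exec(answer);
  }
  return weak;
}

// Invented sources and answer for illustration.
const policySources: SourceDoc[] = [
  { source: "leave-policy.md", content: "Employees receive 25 days of paid leave per year." },
  { source: "remote-policy.md", content: "Home office is allowed two days per week." },
];
const draftAnswer =
  "Employees receive 25 days of paid leave. [Source 1] Managers may work remotely without limits. [Source 2]";

const weakClaims = findWeakCitations(draftAnswer, policySources);
console.log(weakClaims);
```

Here the first claim is fully covered by its source, while the second cites a real source that says nothing of the kind, which is exactly the case an index check lets through.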

Schema validation: if you expect JSON, validate it:

try {
  const parsed = InvoiceSchema.parse(JSON.parse(response));
} catch (e) {
  // Hallucinated / invalid structure
  retry();
}

Confidence measurement

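Logprobs: the logprobs parameter returns the log probability the model assigned to each generated token. Low values mean the model was choosing among several plausible continuations, which is a useful but imperfect proxy for factual uncertainty.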

const response = await openai.chat.completions.create({
  // ...
  logprobs: true,
  top_logprobs: 5
});

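// logprobs can be null, so fall back to an empty list
const tokens = response.choices[0].logprobs?.content ?? [];
const avgLogprob = tokens.length > 0
  ? tokens.reduce((sum, t) => sum + t.logprob, 0) / tokens.length
  : -Infinity; // nothing returned → treat as low confidence

if (avgLogprob < -1.5) { // illustrative threshold; calibrate on your own data
  // Low confidence → manual review or re-ask
}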
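Log probabilities are easier to reason about once converted back to probabilities. The sketch below, with invented sample values and hypothetical helper names, shows that an average logprob of -1.5 corresponds to a typical per-token probability of about 0.22, and that the single weakest token is often more telling than the average.

```typescript
// Geometric-mean token probability from per-token log probabilities (natural log).
function meanTokenProbability(logprobs: number[]): number {
  if (logprobs.length === 0) return 0;
  const avg = logprobs.reduce((sum, lp) => sum + lp, 0) / logprobs.length;
  return Math.exp(avg);
}

// Probability of the least likely token in the answer.
function weakestTokenProbability(logprobs: number[]): number {
  if (logprobs.length === 0) return 0;
  return Math.exp(Math.min(...logprobs));
}

// Hypothetical logprobs for a four-token answer; the third token was a shaky choice.
const sampleLogprobs = [-0.05, -0.1, -2.3, -0.2];

console.log("mean token probability:", meanTokenProbability(sampleLogprobs).toFixed(2));
console.log("weakest token probability:", weakestTokenProbability(sampleLogprobs).toFixed(2));
console.log("threshold -1.5 as a probability:", Math.exp(-1.5).toFixed(2));
```

An average can hide one very uncertain token, for instance a digit in a number or a name, so checking the minimum alongside the mean is a reasonable addition when the answer contains specific facts.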

LLM-as-a-judge

Have another LLM (or the same one in a separate call) judge the answer:

Prompt: "For the question-answer pair below:
        - Is the answer factually correct? (1-5)
        - Does it contain unsupported claims? (yes/no)
        - Are there contradictions? (yes/no)
        Respond as JSON."

Not perfect (the judge can hallucinate too), but it catches a lot.
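Because the judge can hallucinate too, its JSON deserves the same validation as any other model output. The sketch below parses a verdict and applies an acceptance rule. The field names (factualScore, unsupportedClaims, contradictions) and the minimum score of 4 are assumptions chosen to match the three questions in the prompt above; your judge prompt would need to ask for exactly these keys.

```typescript
// Hypothetical verdict shape matching the three questions in the judge prompt.
interface JudgeVerdict {
  factualScore: number;       // 1-5
  unsupportedClaims: boolean; // yes/no
  contradictions: boolean;    // yes/no
}

// Parse and validate the judge's raw JSON; return null if it is malformed.
function parseVerdict(raw: string): JudgeVerdict | null {
  let data: any;
  try {
    data = JSON.parse(raw);
  } catch (error) {
    return null;
  }
  if (data === null || typeof data !== "object") return null;
  const score = data.factualScore;
  if (typeof score !== "number" || score < 1 || score > 5) return null;
  if (typeof data.unsupportedClaims !== "boolean") return null;
  if (typeof data.contradictions !== "boolean") return null;
  return {
    factualScore: score,
    unsupportedClaims: data.unsupportedClaims,
    contradictions: data.contradictions,
  };
}

// Hypothetical acceptance rule: high score and no flagged problems.
// A verdict that could not be parsed counts as a rejection.
function isAcceptable(verdict: JudgeVerdict | null, minScore: number = 4): boolean {
  if (verdict === null) return false;
  return verdict.factualScore >= minScore && !verdict.unsupportedClaims && !verdict.contradictions;
}

const cleanVerdict = parseVerdict('{"factualScore": 5, "unsupportedClaims": false, "contradictions": false}');
const flaggedVerdict = parseVerdict('{"factualScore": 5, "unsupportedClaims": true, "contradictions": false}');
const brokenVerdict = parseVerdict("The answer looks fine to me.");

console.log(isAcceptable(cleanVerdict), isAcceptable(flaggedVerdict), isAcceptable(brokenVerdict));
```

Failing closed on a malformed verdict keeps a confused judge from silently approving answers.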

Production checklist

Before shipping the AI feature, check:

Is RAG in place where factual answers are required?
Is a minimum relevance score configured?
Does the system prompt instruct the model to say "I don't know"?
Is source citation mandatory?
Is structured output used where structure matters?
Is temperature low (0.0-0.3) for factual cases?
Is tool use used where math, dates, real data matter?
Is validation running on the output (schema, citation, business logic)?
Monitoring: are low-confidence cases logged?
Human-in-the-loop: is a human involved in critical decisions (medicine, law, finance)?
Disclaimer: does the user know the answer was generated by AI?

Business risk management

Beyond the technical mitigation, business decisions matter too:

Risk zones

Use case	Risk	Strategy
Marketing copy generation	Low	LLM drafts autonomously, human reviews before publishing
Internal customer-info chatbot	Medium	RAG + source citation + "uncertain → handoff to human"
Legal / medical advice	High	Only with a human expert, never autonomous
Financial transaction decision	Critical	AI suggests, human decides, audit log

The "90% accuracy" trap

If the AI is right 90% of the time, that can be excellent or catastrophic. For a customer-service chatbot, 10% errors may be tolerable. For a drug dosage suggestion, they never are.

The question is: what is the cost of an error?

If low → you can grant autonomy
If high → human-in-the-loop is mandatory
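The same question can be put into numbers. The sketch below compares the expected cost of errors at identical accuracy for two very different use cases. Every figure is invented for illustration, and the decision rule (review when the expected cost of an unreviewed error exceeds the cost of a review) is one simple assumption among many possible ones.

```typescript
// Expected cost of errors over a batch of answers, in whatever currency unit you use.
function expectedErrorCost(accuracy: number, costPerError: number, answers: number = 1000): number {
  if (accuracy < 0 || accuracy > 1) throw new Error("accuracy must be between 0 and 1");
  return (1 - accuracy) * costPerError * answers;
}

// Hypothetical rule: require human review when the expected cost of an
// unreviewed answer exceeds the cost of having a human review it.
function needsHumanReview(accuracy: number, costPerError: number, reviewCostPerAnswer: number): boolean {
  return (1 - accuracy) * costPerError > reviewCostPerAnswer;
}

// Same 90% accuracy, very different stakes (all figures invented).
const chatbotCost = expectedErrorCost(0.9, 5);      // a wrong support answer costs about 5
const dosageCost = expectedErrorCost(0.9, 50000);   // a wrong dosage costs about 50,000

console.log("support chatbot, per 1000 answers:", Math.round(chatbotCost));
console.log("dosage suggestion, per 1000 answers:", Math.round(dosageCost));
console.log("chatbot needs review:", needsHumanReview(0.9, 5, 2));
console.log("dosage needs review:", needsHumanReview(0.9, 50000, 2));
```

With these made-up numbers the chatbot can run autonomously, because reviewing every answer would cost more than the errors do, while the dosage case fails the rule by several orders of magnitude.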

Summary: 7 takeaways

Hallucination is not a bug — it's a natural consequence of the architecture. The model predicts, it doesn't know.
5 types: factual, source fabrication, logical, instruction, context drift. Each needs different mitigation.
RAG is the most effective — give the model sources, don't let it remember. Minimum score, hybrid search, mandatory citation.
Structured output — when forced into a shape, the model hallucinates less. Zod / Pydantic schema, confidence field.
Tool use for math, dates, real data. Never let the model do arithmetic on its own.
Temperature 0.0-0.3 for factual cases. Creativity and factuality pull in opposite directions.
Human-in-the-loop for critical decisions. The 90% accuracy trap — the cost of the 10% decides.

Hallucination can't be fully eliminated, but in well-scoped use cases it can be reduced to low single digits (on the order of 1-2%) with the right architecture. The difference between "an AI feature demo" and "an enterprise-ready AI system" is not the model, it's the validation layer built around it.

The model is a creative child. You are the responsible adult next to it.
