This article is Part 4 of our comprehensive study AI Agent Systems in Enterprise Practice — the full whitepaper presents the world of autonomous and multi-agent systems across 14 chapters.
Why Is Communication Critical?
The strength of a multi-agent system doesn't lie in the individual agents, but in how they work together. The communication pattern determines the system's reliability, cost, and scalability.
Communication Patterns
1. Handoff (Transfer)
One agent transfers the entire conversation to another. The OpenAI Agents SDK supports this natively.
Sales Agent: "This is an invoicing topic — handing off to the Finance Agent."
→ [handoff, with full context]
Finance Agent: "Let me check the invoice details..."
Advantage: Simple, clear transfer of responsibility. Disadvantage: The receiving agent gets the full context → token cost, context noise.
When to use: Domain switch during conversation — e.g., a sales question becomes an invoicing question.
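The handoff flow above can be sketched in plain Python. This is an illustrative stand-in, not the OpenAI Agents SDK API: the agent functions, routing keys, and message format are all made up for the example. The key point is that the full message history travels with the transfer.

```python
# Hypothetical handoff sketch: the sales agent detects an invoicing topic
# and transfers the *entire* conversation history to the finance agent.

def sales_agent(history):
    last = history[-1]["content"].lower()
    if "invoice" in last:
        # Signal a handoff; the full context travels with it.
        return {"handoff_to": "finance", "context": history}
    return {"reply": "Happy to help with your sales question."}

def finance_agent(history):
    return {"reply": f"Let me check the invoice details... "
                     f"({len(history)} messages of context received)"}

AGENTS = {"sales": sales_agent, "finance": finance_agent}

def run(history, agent="sales"):
    while True:
        result = AGENTS[agent](history)
        if "handoff_to" in result:
            agent = result["handoff_to"]   # responsibility transfers...
            history = result["context"]    # ...along with the full history
        else:
            return agent, result["reply"]

agent, reply = run([{"role": "user", "content": "Question about my invoice #42"}])
```

Note that `history` is passed whole: this is exactly the disadvantage named above — the receiving agent pays for all the context, relevant or not.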
2. Delegation + Result (Tool Call)
The orchestrator calls a specialized agent as a "tool" — receives the result, and processes it itself.
Orchestrator: "Query the Q1 statistics."
→ Analytics Agent → { revenue: 12M, deals_won: 45 }
Orchestrator: [processes, formats, responds]
Advantage: The orchestrator controls the output; the specialized agent works in a focused manner. Disadvantage: Extra LLM call for the orchestrator to process the result.
When to use: Data retrieval, computation, background processing.
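A minimal sketch of the delegation pattern, with the analytics agent stubbed as a plain function (in a real system it would be an LLM plus database call); the figures match the example above but are illustrative:

```python
# Delegation sketch: the orchestrator calls the specialist as a tool,
# receives structured data, and formats the answer itself.

def analytics_agent(query: str) -> dict:
    # Stub for an LLM-backed analytics specialist.
    return {"revenue": 12_000_000, "deals_won": 45}

def orchestrator(user_request: str) -> str:
    stats = analytics_agent("Q1 statistics")          # tool call
    return (f"In Q1 we closed {stats['deals_won']} deals "
            f"for a total revenue of {stats['revenue'] / 1e6:.0f}M.")

print(orchestrator("How did Q1 go?"))
```

The specialist returns raw data only; the extra formatting step is the "extra LLM call" cost named above, traded for full control over the final output.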
3. Shared State (Common Workspace)
Agents read and write to a shared data structure — communication happens through changes in shared state.
{
  "customer": { "name": "Kovács Ltd.", "id": 123 },
  "order": null,        // ← Logistics Agent fills in
  "invoice": null,      // ← Finance Agent fills in
  "email_draft": null   // ← Communication Agent fills in
}
Advantage: No direct agent-to-agent communication → simpler, fewer errors. Disadvantage: Potential for race conditions, more complex debugging.
When to use: Complex, multi-step tasks where multiple agents build on each other's work.
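The shared-state pattern can be sketched with a plain dictionary as the workspace. The agent functions below are stand-ins for LLM-backed agents; what matters is that they never call each other, only read and write the workspace:

```python
# Shared-state sketch: agents communicate only through a common workspace.

workspace = {
    "customer": {"name": "Kovács Ltd.", "id": 123},
    "order": None,
    "invoice": None,
    "email_draft": None,
}

def logistics_agent(ws):
    ws["order"] = {"items": ["widget"], "status": "confirmed"}

def finance_agent(ws):
    if ws["order"]:                       # builds on the logistics agent's work
        ws["invoice"] = {"order_id": 1, "total": 990}

def communication_agent(ws):
    if ws["invoice"]:
        ws["email_draft"] = (f"Dear {ws['customer']['name']}, "
                             f"your invoice total is {ws['invoice']['total']}.")

# Sequential here; with concurrent agents, guard writes (locks / versioning)
# to avoid the race conditions mentioned above.
for agent in (logistics_agent, finance_agent, communication_agent):
    agent(workspace)
```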
4. Broadcast (Fan-out)
Parallel task distribution to multiple agents, with result merging.
Advantage: Fast parallel processing. Disadvantage: Expensive (multiple concurrent LLM calls), merging is non-trivial.
When to use: Independent subtasks — e.g., "Analyze Q1 by region" → region agents run in parallel.
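The fan-out idea can be sketched with a thread pool: one stub "region agent" per region runs concurrently, then a merge step combines the results. Regions and revenue figures are made up for the example:

```python
# Fan-out sketch: parallel per-region analysis, then result merging.
from concurrent.futures import ThreadPoolExecutor

def region_agent(region: str) -> dict:
    # Stand-in for a per-region LLM analysis call.
    fake_revenue = {"North": 4, "South": 3, "West": 5}
    return {"region": region, "revenue_m": fake_revenue[region]}

def analyze_q1(regions):
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(region_agent, regions))   # fan-out
    total = sum(r["revenue_m"] for r in results)          # merge
    return {"per_region": results, "total_m": total}

report = analyze_q1(["North", "South", "West"])
```

The merge here is a trivial sum; with free-text agent outputs, merging typically needs another LLM call — the non-trivial part named above.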
The Three Levels of Memory
Memory is what transforms an AI agent from a mere answer machine into an intelligent partner. We distinguish three levels:
1. Working Memory
The current conversation: message exchanges, tool-call results, intermediate decisions.
- Lifespan: Duration of the conversation
- Size: ~4K–32K tokens (depends on model context window)
- Technique: Automatic — the message history itself
2. Short-term Memory
Session-level context that survives beyond the conversation, but not forever.
- Lifespan: Days to weeks
- Size: Compact — summaries and extracts, not full transcripts
- Technique: Database-stored summaries, automatic summarization
Example: "Yesterday we discussed the Kovács deal, we were at the point where..."
3. Long-term Memory
Preferences, business context, past lessons learned. This is what makes the agent a "veteran employee" of the company.
- Lifespan: Months to years
- Size: Ever-growing
- Technique: Knowledge Graph + vector search (embedding)
Example: The agent knows that Kovács Ltd. always orders in Q4, Márta is the contact, and they prefer email over phone calls.
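The three levels can be sketched side by side in one class. The storage backends here (in-memory lists and dicts) are placeholders for a real database and vector store, and the summarization step is truncation standing in for an LLM call:

```python
# Sketch of the three memory levels. Backends and the summarizer are stubs.

class AgentMemory:
    def __init__(self):
        self.working = []      # 1. current conversation (lives for the session)
        self.short_term = []   # 2. per-session summaries (days to weeks)
        self.long_term = {}    # 3. durable facts/preferences (months to years)

    def add_message(self, role, content):
        self.working.append({"role": role, "content": content})

    def end_session(self):
        # Real systems summarize with an LLM; we just truncate and join.
        summary = " | ".join(m["content"][:40] for m in self.working)
        self.short_term.append(summary)
        self.working = []      # working memory does not survive the session

    def learn_fact(self, key, value):
        # In practice: knowledge graph + embeddings for vector search.
        self.long_term[key] = value

mem = AgentMemory()
mem.add_message("user", "Let's continue the Kovács deal tomorrow.")
mem.end_session()
mem.learn_fact("Kovács Ltd.", "orders in Q4; contact: Márta; prefers email")
```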
Multi-Agent Memory Sharing
Architectural Tip
The specialized agent should not receive the entire memory — only what's needed for its task.
Three reasons:
- Token efficiency: Less context = cheaper and faster calls
- Focus: The model performs better with less noise
- GDPR: Data minimization — the Finance Agent shouldn't see the customer's full communication history
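The tip above amounts to scoping: each specialist receives only the memory slices its task requires. The scope map below is an illustrative policy, not a fixed standard, and the memory contents are made up:

```python
# Scoped-context sketch: per-agent allowlists over the shared memory.

FULL_MEMORY = {
    "customer_profile": {"name": "Kovács Ltd.", "contact": "Márta"},
    "communication_history": ["...dozens of emails..."],
    "open_invoices": [{"id": "INV-7", "amount": 990}],
    "preferences": {"channel": "email"},
}

SCOPES = {
    # Finance sees invoices and the bare profile — no email history
    # (GDPR data minimization, token efficiency, focus).
    "finance": ["customer_profile", "open_invoices"],
    "communication": ["customer_profile", "preferences",
                      "communication_history"],
}

def scoped_context(agent: str) -> dict:
    return {key: FULL_MEMORY[key] for key in SCOPES[agent]}

finance_ctx = scoped_context("finance")
```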
The Full Agent System Architecture
The components discussed so far form a coherent system:
┌────────────────────────────────────────────────────────────┐
│                      User Interfaces                       │
│          Web Dashboard │ Mobile App │ Chat Widget          │
└──────────────────────────┬─────────────────────────────────┘
                           │
┌──────────────────────────▼─────────────────────────────────┐
│                        API Gateway                         │
│               Authentication │ Rate Limiting               │
└──────────────────────────┬─────────────────────────────────┘
                           │
┌──────────────────────────▼─────────────────────────────────┐
│                      AI Service Layer                      │
│  ┌─────────┐  ┌──────────┐  ┌──────────┐  ┌────────────┐   │
│  │  Agent  │  │ Context  │  │  Memory  │  │    Tool    │   │
│  │  Loop   │  │ Builder  │  │ Manager  │  │  Executor  │   │
│  └─────────┘  └──────────┘  └──────────┘  └────────────┘   │
└──────────────────────────┬─────────────────────────────────┘
                           │
          ┌────────────────┼──────────────────┐
          │                │                  │
  ┌───────▼──────┐  ┌──────▼───────┐  ┌───────▼──────┐
  │  CRM Tools   │  │ MCP Registry │  │  Knowledge   │
  │  contacts    │  │  Gmail       │  │  Graph / RAG │
  │  deals       │  │  Calendar    │  │  Embeddings  │
  │  tasks       │  │  Invoicing   │  │ Vector Store │
  └──────────────┘  └──────────────┘  └──────────────┘
Agent Loop — Context building → LLM call (provider-agnostic) → tool calls → iteration.
MCP Registry — Dynamic tool registry: plug-and-play, security separation, token-efficient.
Knowledge Graph / RAG — Structured (CRM), semi-structured (emails), unstructured (documents) — vector search provides relevant context.
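The Agent Loop described above (context building → LLM call → tool calls → iteration) can be sketched as follows. `call_llm` and the tool registry are stubs standing in for a real, provider-agnostic model client and MCP-registered tools:

```python
# Agent-loop sketch: build context, call the model, execute tools, iterate.

def call_llm(context):
    # Stub model: requests the stats tool once, then produces a final answer.
    if not any(m["role"] == "tool" for m in context):
        return {"tool": "get_stats", "args": {}}
    return {"final": "Q1 revenue was 12M."}

TOOLS = {"get_stats": lambda **kw: {"revenue_m": 12}}   # stand-in registry

def agent_loop(user_message, max_steps=5):
    context = [{"role": "user", "content": user_message}]     # context building
    for _ in range(max_steps):
        decision = call_llm(context)                          # LLM call
        if "final" in decision:
            return decision["final"]
        result = TOOLS[decision["tool"]](**decision["args"])  # tool call
        context.append({"role": "tool", "content": str(result)})  # iterate
    raise RuntimeError("step limit reached")                  # safety valve

answer = agent_loop("How did Q1 go?")
```

The `max_steps` bound is a common safeguard so a confused model cannot loop on tool calls indefinitely.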
Next in the series: AI Agent Security and Implementation — GDPR, EU AI Act, approval matrix, and a step-by-step implementation guide.