When Search Doesn't Understand What We're Looking For
Imagine: an AI assistant is asked — "When was Kiss Anna's last visit?"
Traditional keyword search returns nothing. The word "last" doesn't appear in any calendar entry. Semantic search, however, understands the meaning: it finds Kiss Anna's most recent calendar entry, because it recognizes that "last visit" = most recent appointment.
This is the difference between word-matching and meaning-matching — and it's the foundation of every modern AI-powered search architecture.
What Is an Embedding?
An embedding transforms text into a numerical vector, typically with 256 to 3,072 dimensions. The key idea: semantically similar texts produce vectors that are close together, while unrelated texts produce vectors that are far apart.
This is essentially how machines "understand" meaning — they don't compare letters, they compare concepts.
"When was Kiss Anna's last visit?"      → [0.23, -0.41, 0.87, ...] ←─┐
                                                                     │ close!
"Kiss Anna's most recent appointment"   → [0.25, -0.39, 0.85, ...] ←─┘
"Marketing budget 2026"                 → [0.71, 0.12, -0.33, ...]  ← far away
The two similar questions produce nearly identical vectors, while the unrelated text's vector points elsewhere. Similarity is measured with cosine similarity: 1 means identical direction, 0 means no relationship, and -1 means opposite.
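As a minimal sketch of the comparison itself (the 4-dimensional vectors below are made-up stand-ins for real embedding output, which has hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding output.
last_visit  = [0.23, -0.41, 0.87, 0.10]
recent_appt = [0.25, -0.39, 0.85, 0.12]
budget_2026 = [0.71, 0.12, -0.33, 0.05]

print(cosine_similarity(last_visit, recent_appt))  # close to 1
print(cosine_similarity(last_visit, budget_2026))  # much lower
```

In a real system the vectors come from an embedding model; only the comparison step changes scale, not shape.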
Why Isn't Vector Search Enough on Its Own?
Semantic search finds similar content — but business questions often require relationships:
- "When was Kiss Anna's last visit, and what did we do?" → Need the calendar event + client data + notes
- "How much did she spend in March?" → Need the client + invoices + bookings
This is where a knowledge graph comes in: business entities (email, calendar, client, invoice) are represented as nodes, the relationships between them as edges — and search uses both.
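A minimal in-memory sketch of that node-and-edge idea (the entity ids, fields, and relation labels are invented for illustration):

```python
# Nodes: business entities keyed by id. Edges: (source, relation, target) triples.
nodes = {
    "client:kiss-anna": {"type": "client",   "name": "Kiss Anna"},
    "event:2025-03-14": {"type": "calendar", "title": "Consultation"},
    "invoice:INV-0042": {"type": "invoice",  "total": 45000},
}
edges = [
    ("client:kiss-anna", "attended", "event:2025-03-14"),
    ("client:kiss-anna", "billed",   "invoice:INV-0042"),
]

def neighbors(node_id):
    """1-hop neighbors of a node, following edges in either direction."""
    outgoing = [t for s, _, t in edges if s == node_id]
    incoming = [s for s, _, t in edges if t == node_id]
    return outgoing + incoming

# A vector hit on the client node can now pull in the related event and invoice.
print(neighbors("client:kiss-anna"))
```

Vector search finds the entry point; the edges answer the "and what did we do?" part of the question.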
pgvector — Vector Search in Your Existing PostgreSQL
No need for a separate vector database (Pinecone, Qdrant). If you already have PostgreSQL, the pgvector extension is free and can be enabled with a single CREATE EXTENSION vector statement, keeping vectors in the same database as your business data.
This means: a single SQL query can perform vector search + graph traversal + tenant filtering, with zero network latency between systems.
At enterprise scale (100M+ vectors), dedicated solutions win — but for most SMEs and mid-market SaaS, pgvector is more than sufficient, and significantly simpler to operate.
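As a sketch of what such a combined query might look like (the table and column names are hypothetical; `<=>` is pgvector's cosine-distance operator, where 0 means identical, so similarity is `1 - distance`):

```python
# Hypothetical schema: nodes(id, tenant_id, kind, content, embedding vector(1536)).
# Tenant filtering and vector ranking happen in one round trip to PostgreSQL.
SEARCH_SQL = """
SELECT id, kind, content,
       1 - (embedding <=> %(query_vec)s) AS similarity
FROM nodes
WHERE tenant_id = %(tenant_id)s           -- tenant filtering in the same query
ORDER BY embedding <=> %(query_vec)s      -- index-assisted vector search
LIMIT 8;
"""

print(SEARCH_SQL)
```

Graph traversal can be bolted onto the same statement with a join or a recursive CTE on the edges table, which is exactly what a separate vector store cannot do in one query.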
The RAG Pipeline in Brief — 5 Steps
The Retrieval-Augmented Generation (RAG) pipeline connects search to the LLM:
- Input validation — Short messages (1-2 characters) carry no semantic content, filter them out
- Vector search — Compare the question's embedding against database vectors (cosine similarity > 0.60, top-8)
- Graph enrichment — Load the top-3 results' neighbors (1-hop neighbors, 0.8 decay factor)
- Deduplication + token budget — Unique nodes, ranked by relevance, within a 3000-token limit
- Format + inject — Markdown context, grouped by type, into the LLM system message
The result: the LLM responds grounded in real business data, with source attribution, which sharply reduces hallucination.
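The five steps can be sketched end to end (an in-memory list stands in for the database, a ~4-characters-per-token rule of thumb stands in for a real tokenizer, and graph enrichment is omitted for brevity):

```python
import math

SIM_THRESHOLD = 0.60   # step 2: minimum cosine similarity
TOP_K = 8              # step 2: top-8 candidates
TOKEN_BUDGET = 3000    # step 4: context size limit

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def run_pipeline(question, question_vec, store):
    """store: list of dicts with 'id', 'vec', 'text', 'kind'."""
    # 1. Input validation: very short messages carry no semantic content.
    if len(question.strip()) < 3:
        return None
    # 2. Vector search: similarity threshold, then top-k.
    scored = [(cosine(question_vec, d["vec"]), d) for d in store]
    hits = sorted((t for t in scored if t[0] > SIM_THRESHOLD),
                  key=lambda t: t[0], reverse=True)[:TOP_K]
    # 3. Graph enrichment of the top-3 would happen here (see the step above).
    # 4. Deduplication + token budget (~4 characters per token as a rough proxy).
    seen, context, used = set(), [], 0
    for _, d in hits:
        tokens = len(d["text"]) // 4 + 1
        if d["id"] in seen or used + tokens > TOKEN_BUDGET:
            continue
        seen.add(d["id"])
        used += tokens
        context.append(d)
    # 5. Format as Markdown, grouped by type, for the LLM system message.
    lines = []
    for kind in sorted({d["kind"] for d in context}):
        lines.append(f"## {kind}")
        lines += [f"- {d['text']}" for d in context if d["kind"] == kind]
    return "\n".join(lines)
```

The returned Markdown is what gets injected into the system message; everything below the threshold simply never reaches the model.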
3 Practical Takeaways
1. Start Simple
pgvector + OpenAI text-embedding-3-small + cosine search — this works in 30 minutes and is sufficient for most SME use cases. Don't over-engineer!
2. Don't Chunk What's Already a Natural Unit
Emails, calendar events, and client profiles are natural units — no need to split them into 500-token chunks. The entity-based approach (1 business object = 1 node + 1 embedding) is simpler and produces better results.
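A sketch of the entity-based approach (field names are illustrative): each business object is serialized once into a single text, which then gets exactly one embedding.

```python
def entity_to_embedding_input(entity):
    """Serialize one business object into one text: one embedding per entity,
    instead of splitting it into fixed-size chunks."""
    parts = [f"{k}: {v}" for k, v in entity.items() if v]
    return "\n".join(parts)

event = {
    "type": "calendar_event",
    "title": "Consultation with Kiss Anna",
    "date": "2025-03-14",
    "notes": "Follow-up in two weeks.",
}
text = entity_to_embedding_input(event)
# `text` is then sent to the embedding model as a single unit,
# e.g. OpenAI's text-embedding-3-small.
```

One object, one node, one embedding: the retrieval result is always a complete business record, never half a chunk.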
3. Graph Enrichment Is the Real Differentiator
For "Who?" "When?" "How much?" questions, vector search alone is weak — loading neighbors brings dramatic quality improvement at minimal additional cost.
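The 0.8 decay factor mentioned in the pipeline can be sketched like this (scores, node ids, and the graph are made up): each neighbor inherits its parent's relevance score, damped per hop, so direct hits always outrank enrichment.

```python
DECAY = 0.8  # each hop away from a direct hit is worth 80% of the previous score

def enrich(hits, neighbor_fn, hops=1):
    """hits: {node_id: similarity score}. Adds neighbors with decayed scores,
    keeping the higher score when a node is reached more than once."""
    scores = dict(hits)
    frontier = dict(hits)
    for _ in range(hops):
        nxt = {}
        for node, score in frontier.items():
            for nb in neighbor_fn(node):
                decayed = score * DECAY
                if decayed > scores.get(nb, 0.0):
                    scores[nb] = nxt[nb] = decayed
        frontier = nxt
    return scores

graph = {"event:123": ["client:kiss-anna", "note:456"]}
scores = enrich({"event:123": 0.91}, lambda n: graph.get(n, []))
# The client and note nodes enter the context at 0.91 * 0.8
```

The extra cost is a handful of key lookups per hit, which is why the quality gain comes nearly for free.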
Want to Go Deeper?
This article is a condensed version of our Semantic Search and Embedding Strategies — Whitepaper. The full whitepaper covers 15 chapters in detail: embedding model comparison, pgvector indexing (IVFFlat vs. HNSW), BullMQ async pipeline, hybrid search (RRF), re-ranking, RAGAS evaluation, GraphRAG, production monitoring, and embedding drift management.
Want to implement semantic search in your own system? The Atlosz Interactive team has production experience with pgvector, knowledge graph, and RAG pipeline architecture. Get in touch for a free technical consultation!