Question 1

How does the retrieval pipeline work?

Accepted Answer

Every query passes through a multi-stage pipeline: spell correction (~10 ms, pyspellchecker with a 120K-word dictionary), query rewriting (resolves pronouns using conversation history), strategy routing (selects one of 4 retrieval strategies based on query complexity), 3-way hybrid search (semantic embeddings + BM25 full-text + ColPali visual), and Cohere reranking with a Cross-Encoder fallback. A semantic cache with a 0.95 similarity threshold and a 24-hour TTL returns instant responses for repeat queries.
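To make the semantic cache step concrete, here is a minimal sketch of a cache that uses the 0.95 similarity threshold and 24-hour TTL mentioned above. The class name `SemanticCache`, the in-memory list storage, and the use of cosine similarity are assumptions made for illustration; the answer does not say how Leepi.ai implements its cache.

```python
import math
import time
from dataclasses import dataclass, field

SIMILARITY_THRESHOLD = 0.95   # threshold stated in the answer
TTL_SECONDS = 24 * 60 * 60    # 24-hour TTL stated in the answer


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class SemanticCache:
    """Hypothetical in-memory cache keyed by query embedding."""
    entries: list = field(default_factory=list)  # (embedding, answer, stored_at)

    def put(self, embedding, answer, now=None):
        now = time.time() if now is None else now
        self.entries.append((list(embedding), answer, now))

    def get(self, embedding, now=None):
        now = time.time() if now is None else now
        # Drop entries older than the TTL before searching.
        self.entries = [e for e in self.entries if now - e[2] < TTL_SECONDS]
        # Pick the stored query that is most similar to the incoming one.
        best = max(self.entries, key=lambda e: cosine(embedding, e[0]), default=None)
        if best is not None and cosine(embedding, best[0]) >= SIMILARITY_THRESHOLD:
            return best[1]
        return None  # cache miss: fall through to the full pipeline
```

On a hit the cached answer is returned and the rest of the pipeline is skipped; on a miss the caller would run the full pipeline and then call `put` with the new embedding and answer.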

Question 2

What retrieval strategies are available?

Accepted Answer

Four strategies adapt to query complexity. Simple performs a single-pass semantic search for straightforward lookups. Multi-hop uses iterative query expansion with HyDE (hypothetical document embeddings) for nuanced questions. Iterative progressively refines results across multiple rounds. Sub-question decomposes complex queries into independent sub-parts, retrieves for each, then synthesizes a unified answer. The QueryRouter selects the best strategy automatically.
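As a rough illustration of strategy routing, the sketch below maps a query to one of the four strategies named above. The answer does not describe how the QueryRouter measures complexity, so the keyword and length heuristics, the `route` function, and the `Strategy` enum are all hypothetical stand-ins.

```python
import re
from enum import Enum


class Strategy(Enum):
    SIMPLE = "simple"
    MULTI_HOP = "multi-hop"
    ITERATIVE = "iterative"
    SUB_QUESTION = "sub-question"


def route(query: str) -> Strategy:
    """Toy router: the thresholds below are illustrative assumptions."""
    text = query.lower()
    words = re.findall(r"[a-z0-9']+", text)
    # Several questions at once, or an explicit comparison: split into sub-questions.
    if text.count("?") > 1 or re.search(r"\b(compare|versus|vs)\b", text):
        return Strategy.SUB_QUESTION
    # Longer "why"/"how" questions tend to need expansion across several hops.
    if re.search(r"\b(why|how)\b", text) and len(words) > 12:
        return Strategy.MULTI_HOP
    # Very long queries get progressive refinement.
    if len(words) > 20:
        return Strategy.ITERATIVE
    # Everything else is a single-pass semantic lookup.
    return Strategy.SIMPLE
```

A production router could just as well use a classifier or an LLM call; the point here is only that routing happens once, before retrieval, and yields exactly one strategy.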

Question 3

Does Leepi.ai support multi-document queries?

Accepted Answer

Yes. When a query spans multiple documents, the context builder groups retrieved chunks by document and inserts clear document headers between groups. A per-document token cap (default 70%) prevents any single source from dominating the context window. The LLM prompt includes explicit instructions to attribute values by document name and flag conflicts when sources disagree.
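The grouping and capping behaviour can be sketched as follows. This assumes the 70% cap is a fraction of the total context token budget, approximates tokens by whitespace-separated words, and uses a made-up header format; `build_context` and its parameters are hypothetical names, not Leepi.ai's API.

```python
from collections import defaultdict


def build_context(chunks, token_budget, per_doc_cap=0.70):
    """Group ranked (doc_name, text) chunks by document under a per-document cap.

    Assumption: the cap is a fraction of `token_budget`, and a "token" is
    approximated here by a whitespace-separated word.
    """
    doc_limit = int(token_budget * per_doc_cap)
    grouped = defaultdict(list)
    used_per_doc = defaultdict(int)
    used_total = 0

    for doc, text in chunks:
        n = len(text.split())
        # Skip chunks that would push one document past its cap
        # or the whole context past the budget.
        if used_per_doc[doc] + n > doc_limit or used_total + n > token_budget:
            continue
        grouped[doc].append(text)
        used_per_doc[doc] += n
        used_total += n

    # One header per document, followed by that document's chunks.
    sections = [
        f"=== Document: {doc} ===\n" + "\n".join(texts)
        for doc, texts in grouped.items()
    ]
    return "\n\n".join(sections)
```

With a budget of 10 tokens, a document that ranks first can contribute at most 7 of them, which leaves room for a second source even when the first has more matching chunks.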

RAG Pipeline

Key capabilities

Frequently asked questions

How does the retrieval pipeline work?

What retrieval strategies are available?

Does Leepi.ai support multi-document queries?

Try it yourself
