Knowledge Graph
Automatically maps people, places, concepts, and their relationships across your entire document library.
Frequently asked questions
How are entities extracted?
Leepi.ai uses a universal named entity recognition (NER) model with 200 entity types across 18 domains, including medical, legal, financial, technology, science, and geography. The model is a single custom-trained spaCy pipeline with a hybrid architecture: an EntityRuler with 1,900+ gazetteer patterns and 8 regex rules runs before a trained NER component built on en_core_web_lg with 560K word vectors. It runs at ~70 MB instead of the 430 MB required by the three separate models it replaced.
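The rule-first ordering can be illustrated without spaCy. The sketch below is not Leepi.ai's pipeline: it uses only the Python standard library, and the gazetteer entries, the regex rule, and the function names rule_pass and merge are hypothetical. It shows the general idea that spans found by rules are kept as-is and a later statistical pass only contributes spans that do not overlap them, which is similar in spirit to how spaCy's NER component respects entities set by an EntityRuler placed before it.

```python
import re

# Hypothetical gazetteer and regex rule; the real pattern set is not shown in the source.
GAZETTEER = {
    "ibuprofen": "DRUG",
    "new york": "CITY",
}
REGEX_RULES = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "DATE"),  # ISO-style dates
]


def rule_pass(text):
    """Return sorted (start, end, label) spans found by gazetteer and regex rules."""
    spans = []
    lowered = text.lower()
    for phrase, label in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(phrase) + r"\b", lowered):
            spans.append((m.start(), m.end(), label))
    for pattern, label in REGEX_RULES:
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), label))
    return sorted(spans)


def merge(rule_spans, model_spans):
    """Keep every rule span; add a model span only if it overlaps no rule span."""
    kept = list(rule_spans)
    for start, end, label in model_spans:
        if all(end <= s or start >= e for s, e, _ in rule_spans):
            kept.append((start, end, label))
    return sorted(kept)


if __name__ == "__main__":
    text = "Ibuprofen was approved in New York on 1984-05-18."
    for start, end, label in rule_pass(text):
        print(label, text[start:end])
```

Here model_spans stands in for whatever a trained NER component would predict; giving the rules precedence keeps high-precision gazetteer matches from being overwritten by a statistical guess.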
What is entity resolution?
Entity resolution merges different mentions of the same real-world entity. Leepi.ai uses a 3-tier chain: exact string match first, then fuzzy matching via RapidFuzz (Levenshtein distance) for near-matches like typos, and finally embedding similarity using OpenAI embeddings for semantically equivalent names. This ensures "IBM", "International Business Machines", and "I.B.M." all resolve to the same entity node in the graph.
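A minimal sketch of such a three-tier chain follows, under several assumptions. To stay dependency-free it substitutes difflib.SequenceMatcher from the standard library for RapidFuzz, and a toy lookup table for OpenAI embeddings. The thresholds, the resolve function, and the vectors are illustrative only and are not taken from Leepi.ai.

```python
import math
from difflib import SequenceMatcher


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def resolve(mention, known, embed, fuzzy_threshold=0.9, embed_threshold=0.85):
    """Return (canonical_name, tier), or (None, "new") if no tier matches.

    `embed` is any callable mapping a string to a vector; thresholds are assumed values.
    """
    key = mention.casefold()

    # Tier 1: exact (case-insensitive) string match.
    for name in known:
        if name.casefold() == key:
            return name, "exact"

    # Tier 2: fuzzy match (difflib here as a stand-in for RapidFuzz).
    def fuzzy(name):
        return SequenceMatcher(None, key, name.casefold()).ratio()

    best = max(known, key=fuzzy, default=None)
    if best is not None and fuzzy(best) >= fuzzy_threshold:
        return best, "fuzzy"

    # Tier 3: embedding similarity for semantically equivalent names.
    vec = embed(mention)
    best = max(known, key=lambda n: cosine(vec, embed(n)), default=None)
    if best is not None and cosine(vec, embed(best)) >= embed_threshold:
        return best, "embedding"

    return None, "new"


# Toy vectors standing in for a real embedding model (hypothetical values).
TOY_VECTORS = {
    "IBM": [0.9, 0.1],
    "International Business Machines": [0.88, 0.12],
    "Apple": [0.1, 0.9],
}


def toy_embed(name):
    return TOY_VECTORS.get(name, [0.0, 0.0])


if __name__ == "__main__":
    known = ["International Business Machines", "Apple"]
    for mention in ["apple", "Internatinal Business Machines", "IBM", "Zebra Corp"]:
        print(mention, "->", resolve(mention, known, toy_embed))
```

Ordering the tiers from cheapest to most expensive means the embedding call is only needed for mentions that neither exact nor fuzzy matching can place.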
How does Graph-RAG work?
When Graph-RAG is enabled, entity relationships in the knowledge graph expand search queries to find related context that keyword search alone would miss. For example, if a document mentions a drug name, the graph can surface related entities like its manufacturer, known side effects, or regulatory approvals from other documents. Co-occurrence edges are weighted using PMI (pointwise mutual information) to rank relationship strength.
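PMI weighting and query expansion can be sketched as follows. This assumes document-level co-occurrence (two entities co-occur when they appear in the same document) and uses PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The function names, the top_k and min_pmi parameters, and the placeholder entity names are hypothetical; how Leepi.ai counts co-occurrences and filters edges is not described in the text above.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_edges(docs):
    """docs: list of sets of entity names. Returns {(a, b): pmi} with a < b."""
    n = len(docs)
    single = Counter()  # number of documents containing each entity
    pair = Counter()    # number of documents containing both entities
    for entities in docs:
        single.update(entities)
        pair.update(combinations(sorted(entities), 2))
    return {
        (a, b): math.log2((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }


def expand(query_entities, edges, top_k=3, min_pmi=0.0):
    """Return up to top_k neighbours of the query entities, strongest PMI first."""
    scored = {}
    for (a, b), score in edges.items():
        if score <= min_pmi:
            continue  # drop pairs that co-occur no more often than chance
        for src, dst in ((a, b), (b, a)):
            if src in query_entities and dst not in query_entities:
                scored[dst] = max(scored.get(dst, score), score)
    return sorted(scored, key=lambda e: (-scored[e], e))[:top_k]


if __name__ == "__main__":
    # Placeholder entity names, one set per document.
    docs = [
        {"DrugX", "AcmePharma"},
        {"DrugX", "AcmePharma", "Regulator"},
        {"Regulator", "OtherCo"},
        {"Regulator", "ThirdCo"},
    ]
    edges = pmi_edges(docs)
    print(expand({"DrugX"}, edges))
```

In this toy corpus "Regulator" appears in most documents, so its co-occurrence with "DrugX" gets a negative PMI and is filtered out, while the rarer but consistent "AcmePharma" pairing ranks first. That is the property that makes PMI more useful than raw co-occurrence counts for ranking relationship strength.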