Where vector search fails
Vector search finds documents that are semantically similar to a query. This is powerful for most questions, but it fails systematically for a specific category: relationship queries.
Consider these questions:
- "Who approved the change order that introduced the $2M liability?"
- "Which contracts reference the subsidiary that was acquired in 2024?"
- "Show me all projects managed by people who report to Sarah Chen."
- "What vendors are connected to the procurement irregularity flagged in the audit?"
These questions are not about finding a document -- they are about traversing relationships between entities. The answer is not in any single chunk; it emerges from connecting information scattered across multiple documents.
Vector search cannot traverse relationships. It can find documents that mention "change order" and documents that mention "$2M liability," but it cannot connect them through the approval chain. Answering "Who approved the change order that introduced the $2M liability?" requires three steps:
- Finding the change order that introduced the $2M liability (which might be in one document)
- Finding who approved that specific change order (which might be in a different document)
- Connecting the two through a shared identifier (the change order number)
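The three steps above can be sketched as a join on a shared identifier. This is a minimal illustration, not a full graph store: the `Fact` type, the change order numbers, the approver names, and the file names are all hypothetical, and it assumes that entities and relations have already been extracted from the documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One extracted relationship: subject --relation--> obj (hypothetical schema)."""
    subject: str
    relation: str
    obj: str
    source: str  # document the fact was extracted from


# Hypothetical facts; note that the liability and the approval
# for the same change order live in different documents.
FACTS = [
    Fact("CO-1042", "introduced", "liability:$2M", "risk_report.pdf"),
    Fact("CO-0977", "introduced", "liability:$150K", "risk_report.pdf"),
    Fact("CO-1042", "approved_by", "J. Rivera", "approval_log.pdf"),
    Fact("CO-0977", "approved_by", "M. Okafor", "approval_log.pdf"),
]


def find_approvers(facts, liability):
    """Return (change_order, approver, source) for orders that introduced `liability`."""
    # Step 1: find the change orders that introduced the liability.
    change_orders = {
        f.subject
        for f in facts
        if f.relation == "introduced" and f.obj == liability
    }
    # Steps 2 and 3: find approvals, keeping only those whose change order
    # number matches one found in step 1 (the shared identifier).
    return [
        (f.subject, f.obj, f.source)
        for f in facts
        if f.relation == "approved_by" and f.subject in change_orders
    ]


approvers = find_approvers(FACTS, "liability:$2M")
print(approvers)
```

The answer comes from matching the change order number across two sources, not from how similar either document is to the query text.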
This is a graph problem, not a similarity problem. For enterprises with complex organisational structures, contractual relationships, and regulatory obligations, these graph queries are among the highest-value questions a RAG system needs to answer.