The vector database landscape
A vector database stores embedding vectors and enables fast similarity search over them. That is the core job. But at enterprise scale, the differences between vector databases matter enormously -- in performance, operational complexity, cost, and the features that make production deployment viable.
Here is the landscape as it stands in 2026, focused on the options that are deployable on your own infrastructure.
Milvus (LF AI & Data Foundation, Apache 2.0). The most mature open-source vector database for large-scale deployment. Written in Go and C++, with a distributed architecture that separates compute, storage, and coordination. Natively supports sharding across nodes, GPU-accelerated indexing and search, and hybrid dense+sparse search. Milvus is the default choice when you know you will exceed a single node's capacity. The trade-off: operational complexity. A production Milvus cluster involves etcd for metadata, MinIO or S3 for storage, typically a message queue such as Pulsar or Kafka, and multiple query/data/index nodes. You are operating a distributed system, with all the monitoring, failover, and upgrade work that implies.
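To make the sharding idea concrete, here is a minimal sketch of scatter-gather search: vectors are routed to shards, each shard returns its local top-k, and a coordinator merges the partial results. This is a generic illustration, not Milvus's implementation. The modulo routing rule and the function names (shard_for, search_shard, search_cluster) are assumptions made for the example, and each shard does a brute-force scan where a real system would use an ANN index.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def shard_for(vector_id, num_shards):
    # Hypothetical routing rule: modulo on an integer id.
    return vector_id % num_shards


def build_shards(vectors, num_shards):
    """Distribute {id: vector} across num_shards in-memory shards."""
    shards = [dict() for _ in range(num_shards)]
    for vid, vec in vectors.items():
        shards[shard_for(vid, num_shards)][vid] = vec
    return shards


def search_shard(shard, query, k):
    """Local top-k on one shard (brute force stands in for an ANN index)."""
    scored = ((cosine_similarity(query, vec), vid) for vid, vec in shard.items())
    return heapq.nlargest(k, scored)


def search_cluster(shards, query, k):
    """Scatter the query to every shard, then merge the partial top-k lists."""
    partials = [hit for shard in shards for hit in search_shard(shard, query, k)]
    return heapq.nlargest(k, partials)


vectors = {
    0: [1.0, 0.0],
    1: [0.9, 0.1],
    2: [0.0, 1.0],
    3: [0.7, 0.7],
    4: [-1.0, 0.0],
}
shards = build_shards(vectors, num_shards=2)
top = search_cluster(shards, query=[1.0, 0.0], k=2)
top_ids = [vid for _, vid in top]
```

Because every shard returns its own k best hits, the merged result matches what a single node holding all the vectors would return. The cost is one network round trip per shard plus the merge, which is part of the coordination overhead described above.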
Qdrant (Apache 2.0). Written in Rust, optimised for single-node performance with horizontal scaling when needed. Qdrant's standout feature is its filtering performance: metadata filters are applied during the search itself (not as a post-filter), so filtered queries stay close to unfiltered speed in most cases and still return a full set of matching results. It supports quantisation (scalar, product, and binary) to reduce memory usage, and on-disk storage for vectors that do not fit in RAM. Simpler to operate than Milvus for deployments up to a few hundred million vectors.
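The difference between filtering during the search and post-filtering is easiest to see on a toy dataset. The sketch below is a brute-force stand-in, not Qdrant's filterable HNSW implementation. The point layout (id, vector, payload) and the function names are assumptions made for the example.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def post_filter_search(points, query, k, predicate):
    """Take the global top-k first, then drop hits that fail the filter."""
    scored = ((cosine_similarity(query, p["vector"]), p["id"], p) for p in points)
    top = heapq.nlargest(k, scored, key=lambda hit: hit[:2])
    return [pid for _, pid, p in top if predicate(p["payload"])]


def filtered_search(points, query, k, predicate):
    """Check the filter on each candidate as it is visited, then rank."""
    scored = (
        (cosine_similarity(query, p["vector"]), p["id"])
        for p in points
        if predicate(p["payload"])  # non-matching points never enter the ranking
    )
    return [pid for _, pid in heapq.nlargest(k, scored)]


points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"lang": "de"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"lang": "de"}},
    {"id": 3, "vector": [0.8, 0.2], "payload": {"lang": "en"}},
    {"id": 4, "vector": [0.1, 0.9], "payload": {"lang": "en"}},
]


def is_english(payload):
    return payload["lang"] == "en"


post_hits = post_filter_search(points, [1.0, 0.0], k=2, predicate=is_english)
filtered_hits = filtered_search(points, [1.0, 0.0], k=2, predicate=is_english)
```

Here the two nearest vectors both fail the filter, so the post-filter variant returns nothing while the in-search variant returns the two best matching points. A post-filtering system has to compensate by over-fetching, which is where the extra latency comes from.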
Weaviate (BSD-3-Clause). Written in Go. Weaviate distinguishes itself with built-in "modules" for vectorisation, reranking, and generative search -- you can configure the entire RAG pipeline within Weaviate. This is convenient for prototyping but can be limiting in production where you want control over each pipeline stage. Strong multi-tenancy support. Its HNSW implementation is well-optimised, and it supports flat, dynamic, and HNSW indexes.
pgvector (PostgreSQL extension, PostgreSQL License). If you already run PostgreSQL, pgvector adds vector similarity search without introducing a new database. It supports HNSW and IVFFlat indexes. Performance is good up to roughly 10-20 million vectors. The advantage: you get vector search, relational queries, ACID transactions, and your existing PostgreSQL expertise in one system. The disadvantage: beyond that size, it cannot match the throughput of purpose-built vector databases.
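As a sketch of what "one system" looks like in practice, the statements below create an HNSW-indexed embedding column and combine a relational filter with a nearest-neighbour ordering in a single query. The documents table, its columns, and the 3-dimensional embedding are hypothetical. The %(name)s placeholders assume a psycopg-style driver, and the Python here only prepares the SQL and parameters; connecting and executing are left out.

```python
# Schema: table and column names are illustrative, not from any real deployment.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    tenant_id integer NOT NULL,
    title text NOT NULL,
    embedding vector(3)
);
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
"""

# One query mixes an ordinary WHERE clause with vector ordering.
# <=> is pgvector's cosine distance operator (smaller means more similar).
SEARCH_SQL = """
SELECT id, title, embedding <=> %(query)s::vector AS cosine_distance
FROM documents
WHERE tenant_id = %(tenant_id)s
ORDER BY embedding <=> %(query)s::vector
LIMIT 5;
"""


def to_vector_literal(values):
    """Format a sequence of numbers as pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


# Parameters a driver would bind to SEARCH_SQL (tenant id is a made-up value).
params = {"query": to_vector_literal([0.1, 0.2, 0.3]), "tenant_id": 42}
```

Because the search is plain SQL, it can sit inside the same transaction as the inserts and updates that maintain the table, which is the main thing a separate vector database cannot offer.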
Chroma (Apache 2.0). An embedded-first vector database designed for simplicity; it runs in-process and can also be run as a standalone server. Excellent for prototyping and small deployments (up to a few million vectors). Not designed for distributed deployment or billion-vector scale. Think of it as SQLite for vectors: perfect when simplicity matters more than scale.