Technology

IVF index

The Inverted File Index (IVF) is a core vector database technology that accelerates Approximate Nearest Neighbor (ANN) search by partitioning high-dimensional data into manageable clusters.

IVF is a foundational Approximate Nearest Neighbor (ANN) algorithm, drastically improving search speed in large-scale vector databases. It operates by first using a clustering algorithm—typically k-means—to divide the dataset (e.g., 1 million vectors) into a pre-defined number of clusters (e.g., 1,024), each represented by a centroid. During indexing, every vector is assigned to the cluster with the closest centroid, creating an 'inverted list' structure. When a query vector is submitted, the system only computes distances to the nearest 'n' centroids (e.g., the top 10), then searches solely within those corresponding clusters. This technique bypasses a brute-force scan of the entire 1 million vectors, reducing the search scope to a fraction of the data, which is critical for achieving sub-second latency in applications like image retrieval and semantic text matching.

https://milvus.io/docs/ivf_index.md

1 project · 1 city

Related technologies

HNSW 7 pg_embedding 1 pgvector 21 PostgreSQL 94

Recent Talks & Demos

Showing 1-1 of 1

Members-Only

pg_embedding: Scale AI to millions

Seattle Jul 26

pg_embedding HNSW