Similarity search in the blink of an eye with compressed indices

Apr 07, 2023
Cecilia Aguerrebere, Ishwar Bhati, Mark Hildebrand, Mariano Tepper, Ted Willke

Data is now commonly represented as vectors. Retrieving the vectors most similar to a given query, among millions or billions, is a ubiquitous problem relevant to a wide range of applications. In this work, we present new techniques for creating faster and smaller indices to run these searches. To this end, we introduce a novel vector compression method, Locally-adaptive Vector Quantization (LVQ), that simultaneously reduces memory footprint and improves search performance, with minimal impact on search accuracy. LVQ is designed to work optimally in conjunction with graph-based indices, reducing their effective bandwidth while enabling random-access-friendly fast similarity computations. Our experimental results show that LVQ, combined with key optimizations for graph-based indices in modern datacenter systems, establishes a new state of the art in terms of performance and memory footprint. For billions of vectors, LVQ outperforms the second-best alternatives: (1) in the low-memory regime, by up to 20.7x in throughput with up to a 3x memory footprint reduction, and (2) in the high-throughput regime, by 5.8x in throughput with 1.4x less memory.

Via arXiv