Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search

FOS: Computer and information sciences Computer Science - Machine Learning Information Retrieval (cs.IR) Computer Science - Information Retrieval Machine Learning (cs.LG)
DOI: 10.48550/arxiv.2405.12207 Publication Date: 2024-05-20
ABSTRACT
Clustering-based nearest neighbor search is a simple yet effective method in which data points are partitioned into geometric shards to form an index, and only few searched during query processing find approximate set of top-$k$ vectors. Even though the efficacy heavily influenced by algorithm that identifies probe, it has received little attention literature. This work attempts bridge gap studying problem routing clustering-based maximum inner product (MIPS). We begin unpacking existing protocols notice surprising contribution optimism. then take page from sequential decision making literature formalize insight following principle ``optimism face uncertainty.'' In particular, we present new framework incorporates moments distribution products within each shard optimistically estimate product. instance our uses first two reach same accuracy as state-of-the-art routers such \scann probing up $50%$ fewer on suite benchmark MIPS datasets. Our also space-efficient: design sketch second moment whose size independent number practice requires storing $O(1)$ additional vectors per shard.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES ()
CITATIONS ()
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....