An Analysis of Fusion Functions for Hybrid Retrieval
FOS: Computer and information sciences
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
0509 other social sciences
Information Retrieval (cs.IR)
Computer Science - Information Retrieval
DOI: 10.1145/3596512
Publication Date: 2023-05-20T08:59:21Z
AUTHORS (3)
ABSTRACT
We study hybrid search in text retrieval where lexical and semantic search are fused together with the intuition that the two are complementary in how they model relevance. In particular, we examine fusion by a convex combination of lexical and semantic scores, as well as the reciprocal rank fusion (RRF) method, and identify their advantages and potential pitfalls. Contrary to existing studies, we find RRF to be sensitive to its parameters; that the learning of a convex combination fusion is generally agnostic to the choice of score normalization; that convex combination outperforms RRF in in-domain and out-of-domain settings; and finally, that convex combination is sample efficient, requiring only a small set of training examples to tune its only parameter to a target domain.
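The two fusion functions named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the min-max normalization, the default `alpha` of 0.5, the RRF constant `eta` of 60, and the choice to give a document a normalized score of 0 when one retriever did not return it are all assumptions made for illustration.

```python
from typing import Dict

Scores = Dict[str, float]  # document id -> retrieval score (higher is better)


def min_max(scores: Scores) -> Scores:
    """Rescale scores to [0, 1]; one possible normalization (an assumption)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    if span == 0:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / span for doc, s in scores.items()}


def convex_fusion(lexical: Scores, semantic: Scores, alpha: float = 0.5) -> Scores:
    """Fuse as alpha * semantic + (1 - alpha) * lexical on normalized scores.

    alpha is the single tunable parameter. A document missing from one
    list is given a normalized score of 0 here (an illustrative assumption).
    """
    lex, sem = min_max(lexical), min_max(semantic)
    docs = set(lex) | set(sem)
    return {
        doc: alpha * sem.get(doc, 0.0) + (1.0 - alpha) * lex.get(doc, 0.0)
        for doc in docs
    }


def rrf(lexical: Scores, semantic: Scores, eta: float = 60.0) -> Scores:
    """Reciprocal rank fusion: sum of 1 / (eta + rank) over both rankings.

    Ranks start at 1. A document absent from a list contributes nothing
    for that list (again an illustrative assumption).
    """
    fused: Scores = {}
    for scores in (lexical, semantic):
        ranked = sorted(scores, key=scores.get, reverse=True)
        for rank, doc in enumerate(ranked, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (eta + rank)
    return fused


if __name__ == "__main__":
    # Made-up scores purely for demonstration.
    lexical = {"a": 10.0, "b": 5.0, "c": 0.0}
    semantic = {"a": 0.2, "b": 0.9, "d": 0.5}
    print(sorted(convex_fusion(lexical, semantic, alpha=0.5).items()))
    print(sorted(rrf(lexical, semantic).items()))
```

Note the difference in what each function consumes: the convex combination uses the score values themselves, so it depends on a normalization step, whereas RRF uses only rank positions and discards score magnitudes. Setting `alpha` to 0 or 1 in this sketch reduces the convex combination to pure lexical or pure semantic ranking, respectively.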
REFERENCES (47)
CITATIONS (13)