Relevance maximization for high-recall retrieval problem: finding all needles in a haystack
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
DOI: 10.1007/s11227-016-1956-8
Publication Date: 2017-01-10T11:19:14Z
AUTHORS (2)
ABSTRACT
High-recall retrieval, which aims to find the full set of relevant documents in a huge result set using effective mining techniques, is particularly useful for patent information retrieval, legal document retrieval, medical document retrieval, market information retrieval, and literature review. Existing high-recall retrieval methods, however, remain far from satisfactory at retrieving all relevant documents, because they must not only meet high recall and precision thresholds but also minimize the number of reviewed documents. To address this gap, we generalize the problem into a novel high-recall retrieval model, which can be described as finding all needles in a giant haystack. To efficiently compute candidate groups consisting of k relevant documents, we propose dynamic diverse retrieval algorithms specialized for patent searching, which enable effective dynamic interactive retrieval. Across various types of datasets, the dynamic ranking method shows considerable improvements in time and cost over conventional static ranking approaches.
REFERENCES (36)
CITATIONS (6)