Inflate and Shrink: Enriching and Reducing Interactions for Fast Text-Image Retrieval
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
DOI:
10.18653/v1/2021.emnlp-main.772
Publication Date:
2021-12-17T03:56:42Z
AUTHORS (3)
ABSTRACT
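By exploiting cross-modal attention, cross-BERT methods have achieved state-of-the-art accuracy in cross-modal retrieval. Nevertheless, the heavy text-image interactions in the cross-BERT model are prohibitively slow for large-scale retrieval. Late-interaction methods trade off retrieval accuracy and efficiency by exploiting cross-modal interaction only in the late stage, attaining a satisfactory retrieval speed. In this work, we propose an inflating and shrinking approach to further boost the efficiency and accuracy of late-interaction methods. The inflating operation plugs several codes into the input of the encoder to exploit the text-image interactions more thoroughly for higher retrieval accuracy. Then the shrinking operation gradually reduces the text-image interactions through knowledge distilling for higher efficiency. Through an inflating operation followed by a shrinking operation, both the efficiency and the accuracy of a late-interaction model are boosted. Systematic experiments on public benchmarks demonstrate the effectiveness of our inflating and shrinking approach.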
CITATIONS (9)