NFDI4DS | UHH-SEMS - Publication Details

Bo Lv

ORCID: 0000-0002-2499-0948

About

Contact & Profiles

A5020575658

Research Areas

Multimodal Machine Learning Applications
Advanced Image and Video Retrieval Techniques
Natural Language Processing Techniques
Domain Adaptation and Few-Shot Learning
Generative Adversarial Networks and Image Synthesis
Topic Modeling
Computer Graphics and Visualization Techniques

China Electronics Technology Group Corporation
2022-2023

Semantic Completion and Filtration for Image–Text Retrieval

OPENALEX - Publications

Song Yang, Qiang Li, Wenhui Li, Xuanya Li, Ran Jin, and 3 more

Image–text retrieval is a vital task in computer vision and has received growing attention, since it connects cross-modality data. It comes with the critical challenges of learning unified representations and eliminating the large gap between the visual and textual domains. Over the past few decades, although many works have made significant progress in image–text retrieval, they are still confronted with the challenge of incomplete text descriptions of images, i.e., how to fully learn the correlations between relevant region–word pairs...

10.1145/3572844 article EN ACM Transactions on Multimedia Computing Communications and Applications 2022-11-23

CD-GAN: Commonsense-Driven Generative Adversarial Network with Hierarchical Refinement for Text-to-Image Synthesis

OPENALEX - Publications

Guokai Zhang, Ning Xu, Chenggang Yan, Bolun Zheng, Yulong Duan, and 2 more

Synthesizing vivid images with descriptive texts is gradually emerging as a frontier cross-domain generation task. However, it is obviously inadequate to generate a high-quality image from one single sentence accurately due to the information asymmetry between modalities, which needs external knowledge to balance the process. Moreover, the limited description of entities in the sentence cannot guarantee the semantic consistency between the text and the generated image, causing a deficiency of details in the foreground and background. Here, we propose...

10.34133/icomputing.0017 article EN cc-by Intelligent Computing 2023-01-01

Knowledge Prompt Makes Composed Pre-Trained Models Zero-Shot News Captioner

OPENALEX - Publications

Yanhui Wang, Ning Xu, Hongshuo Tian, Bo Lv, Yulong Duan, and 2 more

News image captioning aims to generate descriptions containing concrete named entities for news images by leveraging relevant articles. However, existing approaches suffer from two shortcomings: 1) the lack of commonsense knowledge required to understand named entities, and 2) limited multimodal context modeling capabilities. In this paper, we propose to migrate the ability of large-scale pre-trained models to news image captioning. To acquire the factual knowledge for describing named entities, we induce a language model to perform reasoning using context-aware prompts....

10.1109/icme55011.2023.00489 article EN 2022 IEEE International Conference on Multimedia and Expo (ICME) 2023-07-01
