NFDI4DS | UHH-SEMS - Publication Details

Deep learning enables label-free nanoparticle localization from bright-field microscopy images

OPENALEX - Publications

Zezhong Lv Bing Su Xia Xu Wei Li Wei Cui

10.1016/j.colsurfa.2024.136061 article EN Colloids and Surfaces A Physicochemical and Engineering Aspects 2025-01-01

Variational global clue inference for weakly supervised video moment retrieval

Zezhong Lv Bing Su

10.1016/j.knosys.2025.113071 article EN Knowledge-Based Systems 2025-01-01

Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization

Zezhong Lv Bing Su Ji-Rong Wen

Video moment localization aims to retrieve the target segment of an untrimmed video according to a natural language query. Weakly supervised methods have gained attention recently, as the precise temporal location is not always available. However, one of the greatest challenges encountered by weakly supervised methods is the mismatch between video and language induced by coarse annotations. To refine vision-language alignment, recent works contrast cross-modality similarities driven by reconstructing masked queries between positive and negative...

10.1145/3581783.3612495 article EN 2023-10-26

Temporal-enhanced Cross-modality Fusion Network for Video Sentence Grounding

Zezhong Lv Bing Su

Video sentence grounding aims to localize a segment semantically aligned with the given language query from a video. Most existing works let video and language interact only once, at a single early stage. Not only are multi-level dependencies within videos left unexplored, since interactions act fixedly on a specific level, but the guiding role of language is also neglected. To tackle these issues, we propose an efficient network, namely the Temporal-enhanced Cross-modality Fusion Network (TCFN). By directly modulating temporal...

10.1109/icme55011.2023.00257 article EN 2023 IEEE International Conference on Multimedia and Expo (ICME) 2023-07-01

Exploring Temporal Concurrency for Video-Language Representation Learning

Heng Zhang Daqing Liu Zezhong Lv Bing Su Dacheng Tao

Paired video and language data are naturally temporally concurrent, which requires modeling the temporal dynamics within each modality and the alignment across modalities simultaneously. However, most existing video-language representation learning methods focus only on discrete semantic alignment, which encourages aligned semantics to be close in the latent space, or on context dependency, which captures short-range coherence, and thus fail to build temporal concurrency. In this paper, we propose to learn representations by modeling video-language pairs as Temporal...

10.1109/iccv51070.2023.01427 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

A Jensen-Shannon Divergence Driven Metric of Visual Scanning Efficiency Indicates Performance of Virtual Driving

Zezhong Lv Qing Xu Klaus Schoeffmann Simon Parkinson

Visual scanning plays an important role in sampling visual information from the surrounding environments for a lot of everyday sensorimotor tasks, such as driving. In this paper, we consider the problem of the visual scanning mechanism underpinning sensorimotor tasks in 3D dynamic environments. We exploit the use of eye tracking data as a behaviometric, indicating a visuo-motor behavioral measure in the context of virtual driving. A new metric of visual scanning efficiency (VSE), which is defined as a mathematical divergence between the fixation distribution and the optical flows induced by...

10.1109/icme51207.2021.9428109 article EN 2021 IEEE International Conference on Multimedia and Expo (ICME) 2021-06-09
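
The entry above names a Jensen-Shannon divergence between a fixation distribution and optical flows, but the truncated abstract does not give the exact VSE formula. The following is only a minimal sketch of the standard Jensen-Shannon divergence between two discrete distributions, not the paper's metric. The function names and the four-bin histograms are hypothetical and chosen purely for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; symmetric and bounded in [0, 1]."""
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def normalize(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical histograms over the same four image regions
# (illustrative values only, not data from the paper).
fixations = normalize([8, 1, 1, 0])
optical_flow = normalize([2, 3, 3, 2])
divergence = js_divergence(fixations, optical_flow)
```

With base-2 logarithms the divergence is 0 for identical distributions and 1 for distributions with disjoint support, so a bounded score can be derived from it. How the paper maps the divergence to an efficiency value is not stated in this listing.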

Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization

Zezhong Lv Bing Su Ji-Rong Wen

Video moment localization aims to retrieve the target segment of an untrimmed video according to a natural language query. Weakly supervised methods have gained attention recently, as the precise temporal location is not always available. However, one of the greatest challenges encountered by weakly supervised methods is the mismatch between video and language induced by coarse annotations. To refine vision-language alignment, recent works contrast cross-modality similarities driven by reconstructing masked queries between positive and negative...

10.48550/arxiv.2308.05648 preprint EN other-oa arXiv (Cornell University) 2023-01-01

New Measures of Visual Scanning Efficiency and Cognitive Effort

Zezhong Lv Qing Xu Klaus Schoeffmann Simon Parkinson

Visual scanning plays an important role in sampling visual information from the surrounding environments for a lot of everyday sensorimotor tasks, such as walking and car driving. In this paper, we consider the problem of the visual scanning mechanism underpinning sensorimotor tasks in 3D dynamic environments. We exploit the use of eye tracking data as a behaviometric, indicating visuo-motor behavioral measures in the context of virtual driving. A new metric of visual scanning efficiency (VSE), which is defined as a mathematical divergence between a fixation distribution and optical...

10.1101/2020.11.17.386185 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-11-18

Transfer Entropy Based Causality From Head Motion To Eye Movement

Zezhong Lv Qing Xu Klaus Schoeffmann Simon Parkinson

Eye movement behavior, which supports visual information acquisition and processing, plays an important role when human beings perform sensorimotor tasks, such as driving, in everyday life. In performing these tasks, eye movement arises through a specific coordination of head and eye in gaze changes, with head motions preceding eye movements. Notably, we believe that this coordination in essence indicates a kind of causality. In this paper, we investigate transfer entropy to set up a quantity for measuring unidirectional causality from head motion...

10.1101/2021.03.11.434910 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2021-03-11