Zezhong Lv

ORCID: 0000-0003-0207-8998
Research Areas
  • Multimodal Machine Learning Applications
  • Video Analysis and Summarization
  • Gaze Tracking and Assistive Technology
  • Advanced Image and Video Retrieval Techniques
  • Visual Perception and Processing Mechanisms
  • Human Pose and Action Recognition
  • Visual Attention and Saliency Detection
  • Image Processing Techniques and Applications
  • Neural Dynamics and Brain Function
  • Digital Imaging for Blood Diseases
  • Domain Adaptation and Few-Shot Learning
  • Cell Image Analysis Techniques

Renmin University of China
2023

Beijing Institute of Big Data Research
2023

Tianjin University
2020-2021

10.1016/j.colsurfa.2024.136061 article EN Colloids and Surfaces A: Physicochemical and Engineering Aspects 2025-01-01

Video moment localization aims to retrieve the target segment of an untrimmed video according to a natural language query. Weakly supervised methods have gained attention recently, as the precise temporal location is not always available. However, one of the greatest challenges encountered by weakly supervised methods lies in the mismatch between the video and the language induced by coarse annotations. To refine the vision-language alignment, recent works contrast the cross-modality similarities driven by reconstructing masked queries between positive and negative...

10.1145/3581783.3612495 article EN 2023-10-26

Video sentence grounding aims to localize a video segment that semantically aligns with the given language query. Most existing works let the video and the language interact only once, at a single early stage. As a result, multi-level dependencies within videos are not explored, since the interactions act fixedly on a specific level, and the guiding role of the language is neglected. To tackle these issues, we propose an efficient network, namely the Temporal-enhanced Cross-modality Fusion Network (TCFN). By directly modulating temporal...

10.1109/icme55011.2023.00257 article EN 2023 IEEE International Conference on Multimedia and Expo (ICME) 2023-07-01

Paired video and language data are naturally temporally concurrent, which requires modeling the temporal dynamics within each modality and the temporal alignment across modalities simultaneously. However, most existing video-language representation learning methods focus only on discrete semantic alignment, which encourages aligned semantics to be close in the latent space, or on temporal context dependency, which captures short-range coherence, and thus fail to build temporal concurrency. In this paper, we propose to learn video-language representations by modeling video-language pairs as Temporal...

10.1109/iccv51070.2023.01427 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Visual scanning plays an important role in sampling visual information from the surrounding environment for many everyday sensorimotor tasks, such as driving. In this paper, we consider the problem of the visual scanning mechanism underpinning sensorimotor tasks in 3D dynamic environments. We exploit eye tracking data as a behaviometric, indicating a visuo-motor behavioral measure in the context of virtual driving. A new metric of visual scanning efficiency (VSE), which is defined as a mathematical divergence between a fixation distribution and a distribution of optical flows induced by...

10.1109/icme51207.2021.9428109 article EN 2021 IEEE International Conference on Multimedia and Expo (ICME) 2021-06-09

Video moment localization aims to retrieve the target segment of an untrimmed video according to a natural language query. Weakly supervised methods have gained attention recently, as the precise temporal location is not always available. However, one of the greatest challenges encountered by weakly supervised methods lies in the mismatch between the video and the language induced by coarse annotations. To refine the vision-language alignment, recent works contrast the cross-modality similarities driven by reconstructing masked queries between positive and negative...

10.48550/arxiv.2308.05648 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Visual scanning plays an important role in sampling visual information from the surrounding environment for many everyday sensorimotor tasks, such as walking and car driving. In this paper, we consider the problem of the visual scanning mechanism underpinning sensorimotor tasks in 3D dynamic environments. We exploit eye tracking data as a behaviometric, indicating visuo-motor behavioral measures in the context of virtual driving. A new metric of visual scanning efficiency (VSE), which is defined as a mathematical divergence between a fixation distribution and a distribution of optical...

10.1101/2020.11.17.386185 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-11-18

Eye movement behavior, which provides visual information acquisition and processing, plays an important role in the sensorimotor tasks, such as driving, that human beings perform in everyday life. In the procedure of performing sensorimotor tasks, eye movement is contributed through a specific coordination of head and eye in gaze changes, with head motions preceding eye movements. Notably, we believe that this coordination in essence indicates a kind of causality. In this paper, we investigate transfer entropy to set up a quantity for measuring a unidirectional causality from head motion...

10.1101/2021.03.11.434910 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2021-03-11