NFDI4DS | UHH-SEMS - Publication Details

Fuwei Zhang

ORCID: 0000-0003-0179-9988

About

Contact & Profiles

A5045437445

Research Areas

Video Analysis and Summarization
Multimodal Machine Learning Applications
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Human Pose and Action Recognition
Data Quality and Management
Music and Audio Processing
Advanced Vision and Imaging
Multimedia Communication and Technology
Machine Learning and Data Classification
Natural Language Processing Techniques

North University of China
2024

Sun Yat-sen University
2022-2024

ERM: Energy-Based Refined-Attention Mechanism for Video Question Answering

OPENALEX - Publications

Fuwei Zhang Ruomei Wang Fan Zhou Yuanmao Luo

Spatiotemporal attention learning remains a challenging video question answering (VideoQA) task, as it requires sufficient understanding of cross-modal spatiotemporal information. Existing methods usually leverage different mechanisms to reveal potential associations between the video and the question. While these methods effectively remove irrelevant information from the attention, they ignore pseudo-related information within the interaction attention. To address this problem, we propose a novel energy-based refined-attention...

10.1109/tcsvt.2022.3212463 article EN IEEE Transactions on Circuits and Systems for Video Technology 2022-10-05

Video Q&A based on two-stage deep exploration of temporally-evolving features with enhanced cross-modal attention mechanism

OPENALEX - Publications

Yuanmao Luo Ruomei Wang Fuwei Zhang Fan Zhou Mingyang Liu and 1 more

10.1007/s00521-024-09482-8 article EN Neural Computing and Applications 2024-02-27

Modality-aware Heterogeneous Graph for Joint Video Moment Retrieval and Highlight Detection

OPENALEX - Publications

Ruomei Wang Jiawei Feng Fuwei Zhang Xiaonan Luo Yuanmao Luo

The joint task of video moment retrieval and highlight detection is challenging, as it requires building a model that not only captures contextual information between sequences in time but also has the ability to understand and judge significance. This paper addresses these problems from three aspects. First, we design a parameter-free cross-modal statistical correlation interaction method. A novel saliency enhancement function is defined to quantify the differences between important features associated with...

10.1109/tcsvt.2024.3389024 article EN IEEE Transactions on Circuits and Systems for Video Technology 2024-04-16

DMAP: Decoupling-Driven Multi-Level Attribute Parsing for Interpretable Outfit Collocation

OPENALEX - Publications

Zhuo Su Yilin Chen Fuwei Zhang Ruomei Wang Fan Zhou and 1 more

Outfit collocation requires considering the interrelationship and adaptability among the attributes of component items. However, with numerous diverse fashion items, accurately capturing attribute features and modeling the complex relationships between them become key challenges. To address these challenges, we propose a novel scheme, Decoupling-driven Multi-level Attribute Parsing (DMAP), for interpretable outfit collocation. First, we decouple a series of attributes from the item's visual feature by fully supervised learning, which can improve...

10.1109/tmm.2024.3402541 article EN IEEE Transactions on Multimedia 2024-01-01

Subtask Prior-driven Optimized Mechanism on Joint Video Moment Retrieval and Highlight Detection

OPENALEX - Publications

Siyu Zhou Fuwei Zhang Ruomei Wang Fan Zhou Zhuo Su

10.1109/tcsvt.2024.3409897 article EN IEEE Transactions on Circuits and Systems for Video Technology 2024-01-01

HSSHG: Heuristic Semantics-constrained Spatio-temporal Heterogeneous Graph for VideoQA

OPENALEX - Publications

Ruomei Wang Yuanmao Luo Fuwei Zhang M. Liu Xiaonan Luo

10.1109/tmm.2024.3443661 article EN IEEE Transactions on Multimedia 2024-01-01

PSAM: Parameter-Free Spatiotemporal Attention Mechanism for Video Question Answering

OPENALEX - Publications

Fuwei Zhang Ruomei Wang Fan Zhou Yuanmao Luo Jinyu Li

Spatiotemporal attention learning has always been a challenging research task in video question answering (VideoQA). It needs to consider not only the modelling of local neighbourhood dependencies between adjacent frames but also long-term dependencies between nonadjacent frames. Although existing methods are usually good at modelling one aspect, they cannot simultaneously and effectively model both. To address this issue, we first derive a novel statistic-driven difference-aware generation function, which can...

10.1109/tmm.2023.3333192 article EN IEEE Transactions on Multimedia 2023-11-15
