Feng Zhu

ORCID: 0009-0002-5759-9176
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Multimodal Machine Learning Applications
  • Advanced Image and Video Retrieval Techniques
  • Video Surveillance and Tracking Methods
  • Face recognition and analysis
  • Human Pose and Action Recognition
  • Adversarial Robustness in Machine Learning
  • Diverse Approaches in Healthcare and Education Studies
  • Data Visualization and Analytics
  • Advanced Chemical Sensor Technologies
  • Air Quality Monitoring and Forecasting
  • Cancer-related molecular mechanisms research
  • Emotion and Mood Recognition
  • Image Retrieval and Classification Techniques
  • Gait Recognition and Analysis
  • Face and Expression Recognition
  • Robotics and Sensor-Based Localization

Group Sense (China)
2019-2025

Beijing Sport University
2024

Recently, low-bit (e.g., 8-bit) network quantization has been extensively studied to accelerate inference. Besides inference, training with quantized gradients can bring even more considerable acceleration, since the backward process is often computation-intensive. Unfortunately, inappropriate quantization of the backward propagation usually makes training unstable and can even cause it to crash. A successful unified framework that supports diverse networks on various tasks is still lacking. In this paper, we make an attempt to build 8-bit (INT8)...

10.1109/cvpr42600.2020.00204 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01
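
As a rough illustration of what quantizing gradients to 8 bits involves, the sketch below applies generic symmetric per-tensor quantization to a small list of values. It is not the framework described in the abstract above: the function names `quantize_int8` and `dequantize_int8`, the sample values in `grads`, and the choice of a single max-based scale are all assumptions made for illustration.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to signed 8-bit integers.

    Hypothetical helper for illustration only; a single scale is derived
    from the largest absolute value so that it maps to 127.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 for an all-zero (or empty) input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric INT8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Map INT8 codes back to approximate floating-point values."""
    return [q * scale for q in quantized]


# Made-up gradient values, used only to exercise the functions.
grads = [0.02, -0.5, 0.13, 1.27, -1.0]
q, scale = quantize_int8(grads)
restored = dequantize_int8(q, scale)

# The rounding error per element is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(grads, restored))
print(q, scale, max_error)
```

With a single max-based scale, one large value stretches the quantization step for every other element, so small gradients lose relative precision. That is one plausible reason why a careless choice of scale in the backward pass could destabilize training, although the abstract does not spell out the mechanism.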

The pretrain-finetune paradigm is a classical pipeline in visual learning. Recent progress on unsupervised pretraining methods shows superior transfer performance to their supervised counterparts. This paper revisits this phenomenon and sheds new light on understanding the transferability gap between unsupervised and supervised pretraining from a multilayer perceptron (MLP) perspective. While previous works [6], [8], [17] focus on the effectiveness of the MLP for image classification where pretraining and evaluation are conducted on the same dataset, we reveal that...

10.1109/cvpr52688.2022.00897 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Visual relationships are crucial for visual perception and reasoning, and cover tasks like Scene Graph Generation, Human-Object Interaction, and object affordance. Despite significant efforts, this field still suffers from the following limitations: specialists for a specific task without considering similar ones, strict and complex formulations with limited flexibility, and underexploited reasoning with language and knowledge. To solve these limitations, we seek to build a new framework, one model for all tasks, over Large...

10.1109/tpami.2025.3531452 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2025-01-01

For privacy and security concerns, the need to erase unwanted information from pre-trained vision models is becoming evident nowadays. In real-world scenarios, erasure requests originate at any time from both users and model owners, and these requests usually form a sequence. Therefore, under such a setting, selective information is expected to be continuously removed while maintaining the rest. We define this problem as continual forgetting and identify three key challenges. (i) For unwanted knowledge, efficient and effective deleting is crucial. (ii)...

10.48550/arxiv.2501.09705 preprint EN arXiv (Cornell University) 2025-01-16

Recently, person re-identification (ReID) has witnessed fast development due to its broad practical applications and the various settings proposed for it, e.g., traditional ReID, clothes-changing ReID, and visible-infrared ReID. However, current studies primarily focus on single specific tasks, which limits model applicability in real-world scenarios. This paper aims to address this issue by introducing a novel instruct-ReID task that unifies 6 existing ReID tasks into one and retrieves images based on the provided visual or...

10.1109/tpami.2025.3538766 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2025-01-01

In this paper, we propose an online multi-object tracking (MOT) approach that integrates data association and single object tracking (SOT) with a unified convolutional network (ConvNet), named DASOTNet. The intuition behind integrating data association and SOT is that they can complement each other. Following the Siamese architecture, DASOTNet consists of the shared feature ConvNet, the data association branch, and the SOT branch. Data association is treated as a special re-identification task and solved by learning discriminative features for different targets in the data association branch. To handle the problem...

10.1609/aaai.v34i07.6694 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

In recent years, person re-identification (ReID) has developed rapidly due to its broad practical applications. Most existing benchmarks assume that the same person wears the same clothes across captured images, while, in real-world scenarios, a person may change his/her clothes frequently. Thus the Clothes-Changing ReID (CC-ReID) problem is introduced and several related benchmarks are established. CC-ReID is a very difficult task, as the main visual characteristics of a human body, the clothes, are different between query and gallery, and clothes-irrelevant...

10.1109/tcsvt.2022.3216769 article EN IEEE Transactions on Circuits and Systems for Video Technology 2022-11-03

Exploiting the relationships between attributes is a key challenge for improving multiple facial attribute recognition. In this work, we are concerned with two types of correlations: spatial and non-spatial relationships. For the spatial correlation, we aggregate attributes with spatial similarity into a part-based group and then introduce Group Attention Learning to generate the group attention feature. On the other hand, to discover the non-spatial relationship, we model a group-based Graph Correlation Learning to explore affinities among the predefined groups. We utilize such affinity...

10.1109/icme51207.2021.9428078 article EN 2021 IEEE International Conference on Multimedia and Expo (ICME) 2021-06-09

Recently, low-bit (e.g., 8-bit) network quantization has been extensively studied to accelerate inference. Besides inference, training with quantized gradients can bring even more considerable acceleration, since the backward process is often computation-intensive. Unfortunately, inappropriate quantization of the backward propagation usually makes training unstable and can even cause it to crash. A successful unified framework that supports diverse networks on various tasks is still lacking. In this paper, we make an attempt to build 8-bit (INT8)...

10.48550/arxiv.1912.12607 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Video-text retrieval has greatly benefited from the massive web video data of recent years, while its performance is still limited by the weak supervision of uncurated data. In this work, we propose to leverage the well-represented information of each original modality and exploit the complementary information in two views of the same video, i.e., video clips and captions, by using one view to obtain positive samples with the neighboring samples of the other. Respecting the hierarchical organization of real-world data, we further design a hierarchical cross-modal pre-training method (HCP)...

10.1109/icassp49357.2023.10097061 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Person clustering with multi-modal clues, including faces, bodies, and voices, is critical for various tasks, such as movie parsing and identity-based editing. Related multi-view methods mainly project features into a joint feature space. However, the clue features are usually rather weakly correlated due to the semantic gap arising from modality-specific uniqueness. As a result, these methods are not suitable for person clustering. In this paper, we propose...

10.1109/tmm.2023.3304454 article EN IEEE Transactions on Multimedia 2023-08-22

The pretrain-finetune paradigm is a classical pipeline in visual learning. Recent progress on unsupervised pretraining methods shows superior transfer performance to their supervised counterparts. This paper revisits this phenomenon and sheds new light on understanding the transferability gap between unsupervised and supervised pretraining from a multilayer perceptron (MLP) perspective. While previous works focus on the effectiveness of the MLP for image classification where pretraining and evaluation are conducted on the same dataset, we reveal that the projector is also a key...

10.48550/arxiv.2112.00496 preprint EN other-oa arXiv (Cornell University) 2021-01-01