De Cheng

ORCID: 0000-0003-4603-847X
Research Areas
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Human Pose and Action Recognition
  • Video Surveillance and Tracking Methods
  • Image Enhancement Techniques
  • COVID-19 diagnosis using AI
  • Advanced Vision and Imaging
  • Advanced Image Processing Techniques
  • Advanced Neural Network Applications
  • Advanced Image and Video Retrieval Techniques
  • Advanced Image Fusion Techniques
  • Video Analysis and Summarization
  • Machine Learning and ELM
  • Advanced optical system design
  • Visual Attention and Saliency Detection
  • Natural Language Processing Techniques
  • Machine Learning and Data Classification
  • Video Coding and Compression Technologies
  • Adaptive optics and wavefront sensing
  • Advanced Data Compression Techniques
  • Advanced Multi-Objective Optimization Algorithms
  • Advanced Measurement and Metrology Techniques
  • Neural Networks and Applications
  • Anomaly Detection Techniques and Applications
  • Optical measurement and interference techniques

Xidian University
2021-2025

Nanjing Normal University
2025

Beijing Institute of Technology
2007-2022

Beijing Institute of Optoelectronic Technology
2022

Huawei Technologies (China)
2020-2021

Hong Kong University of Science and Technology
2021

University of Hong Kong
2021

Beijing Information Science & Technology University
2013

Analysis Group (United States)
2013

Capsule networks (CapsNets) are known to be difficult to extend to deeper architectures, which are desirable for high performance in the deep learning era, due to complex capsule routing algorithms. In this article, we present a simple yet effective algorithm based on residual pose routing. Specifically, the higher-layer pose is obtained by an identity mapping on the adjacent lower-layer pose. This has two advantages: 1) reducing computational complexity and 2) avoiding vanishing gradients in the framework. On top...

10.1109/tnnls.2023.3347722 article EN IEEE Transactions on Neural Networks and Learning Systems 2024-01-09

Unsupervised person re-identification (Re-Id) has attracted increasing attention due to its practical application in real-world video surveillance systems. Traditional unsupervised Re-Id methods are mostly based on alternating between clustering and fine-tuning with classification or metric learning objectives on the grouped clusters. However, since Re-Id is an open-set problem, these methods often leave out many outlier instances or group instances into wrong clusters, and thus cannot make full use of the training...

10.1109/tip.2022.3169693 article EN IEEE Transactions on Image Processing 2022-01-01

In label-noise learning, estimating the transition matrix has attracted more and more attention, as it plays an important role in building statistically consistent classifiers. However, it is very challenging to estimate T(x), where x denotes the instance, because it is unidentifiable under instance-dependent noise (IDN). To address this problem, we note that there is psychological and physiological evidence showing that humans are likely to annotate instances of similar appearance with the same classes, thus...

10.1109/cvpr52688.2022.01613 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Zero-shot learning (ZSL) aims to learn models that can recognize images of semantically related unseen categories by transferring attribute-based knowledge learned from training data of seen classes to testing data. As visual attributes play a vital role in ZSL, recent embedding-based methods usually focus on a compatibility function between the visual representation and class semantic attributes. In this work, in addition to simply learning region embeddings of different attributes to maintain the generalization capability of the model, we...

10.1109/tcsvt.2023.3243205 article EN IEEE Transactions on Circuits and Systems for Video Technology 2023-02-08

Image dehazing is a pivotal preliminary step in the advancement of robust intelligent surveillance systems. However, it is an extremely challenging ill-posed problem, as it faces severe information degradation when accurately restoring a clean image from its haze-polluted counterpart. This paper proposes a novel Progressive Negative Enhancing (PNE) contrastive learning mechanism to fully exploit various types of negative information, thereby facilitating the traditional positive-oriented objective function...

10.1109/tmm.2024.3382493 article EN IEEE Transactions on Multimedia 2024-01-01

Zero-shot object detection aims at incorporating class semantic vectors to realize the detection of (both seen and) unseen classes given an unconstrained test image. In this study, we reveal the core challenges in this research area: how to synthesize robust region features (for objects) that are as intra-class diverse and inter-class separable as real samples, so that strong detectors can be trained upon them. To address these challenges, we build a novel zero-shot framework that contains Intra-class Semantic Diverging...

10.1109/cvpr52688.2022.00747 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

With the emergence of large pre-trained vision-language models like CLIP, transferable representations can be adapted to a wide range of downstream tasks via prompt tuning. Prompt tuning tries to probe the beneficial information for downstream tasks from the general knowledge stored in the model. A recently proposed method named Context Optimization (CoOp) introduces a set of learnable vectors as a text prompt on the language side. However, tuning the text prompt alone can only adjust the synthesized "classifier", while the visual features computed by the image encoder are not affected, thus...

10.1109/tmm.2023.3291588 article EN IEEE Transactions on Multimedia 2023-07-03

Current approaches for video grounding propose various complex architectures to capture video-text relations and have achieved impressive improvements. However, it is in fact hard to learn complicated multi-modal relations by architecture design alone. In this paper, we introduce a novel Support-set Based Cross-Supervision (Sscs) module which can improve existing methods during the training phase without extra inference cost. The proposed Sscs contains two main components, i.e.,...

10.1109/iccv48922.2021.01137 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Due to the lack of temporal annotation, current Weakly-supervised Temporal Action Localization (WTAL) methods generally suffer from over-complete or incomplete localization. In this paper, we aim to leverage text information to boost WTAL from two aspects, i.e., (a) a discriminative objective to enlarge the inter-class difference, thus reducing over-complete localization; (b) a generative objective to enhance intra-class integrity, thus finding more complete boundaries. For the discriminative objective, we propose a Text-Segment Mining (TSM) mechanism,...

10.1109/cvpr52729.2023.01026 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

In real-world applications, image degeneration caused by adverse weather is always complex and changes with different conditions across days and seasons. Systems in real-world environments constantly encounter conditions that were not previously observed. Therefore, weather removal models are practically required to learn continually from incrementally collected data reflecting various degeneration types. Existing approaches, for either single or multiple weathers, are mainly designed for a static learning paradigm, which assumes that the data of all types...

10.1109/tmm.2024.3377136 article EN IEEE Transactions on Multimedia 2024-01-01

Due to sparse single-frame annotations, current Single-Frame Temporal Action Localization (SF-TAL) methods generally employ threshold-based pseudo-label generation strategies. However, these approaches suffer from inefficient data utilization, as only the unlabeled frames with confidence scores surpassing a predefined threshold are selected for training. Moreover, the variability of annotations and unreliable model predictions introduce noise. To address these challenges, we propose two...

10.1109/tip.2024.3378477 article EN IEEE Transactions on Image Processing 2024-01-01

Revealing the characteristics of vegetation coverage change and the effects of topography on it in rural areas since the reform and opening up has direct implications for further understanding human-land coupling processes and resource and environmental changes, and provides a reference for ecological environment protection and rural revitalization. Using Taihua Town, Yixing City, Jiangsu Province, as a case study, we revealed the basic features of vegetation coverage change from 1986 to 2020. Furthermore, using 0.25 km

10.13227/j.hjkx.202311008 article EN PubMed 2025-01-08

10.5220/0013319900003890 article EN Proceedings of the 14th International Conference on Agents and Artificial Intelligence 2025-01-01

Multimodal person reidentification (ReID), which aims to learn modality-complementary information by utilizing multimodal images simultaneously for retrieval, is crucial for achieving all-time and all-weather monitoring. Existing methods try to address this issue through modality fusion to absorb complementary information. However, most of these methods are limited to the spatial domain only and usually overlook intra-/intermodal interactions during feature fusion, resulting in insufficient learning...

10.1109/tnnls.2025.3544679 article EN IEEE Transactions on Neural Networks and Learning Systems 2025-01-01

A critical challenge for multi-modal Object Re-Identification (ReID) is the effective aggregation of complementary information to mitigate illumination issues. State-of-the-art methods typically employ complex and highly coupled architectures, which unavoidably result in heavy computational costs. Moreover, the significant distribution gap among different image spectra hinders the joint representation of multimodal features. In this paper, we propose a framework named PromptMA to establish...

10.1109/tip.2025.3556531 article EN IEEE Transactions on Image Processing 2025-01-01

The hyperspectral image super-resolution (HSISR) task has been widely studied, and significant progress has been made by leveraging deep convolutional neural network (CNN) techniques. Nevertheless, the scarcity of training images hinders research on the HSISR task. Moreover, differences in imaging conditions and in the number of spectral bands among different datasets make it very difficult to construct a unified network. In this paper, we first present a non-training-based method that captures prior knowledge to restore high...

10.1109/tgrs.2022.3185647 article EN IEEE Transactions on Geoscience and Remote Sensing 2022-01-01

10.1109/cvpr52733.2024.02227 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16