Wentao Ma

ORCID: 0000-0003-3059-6629
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Digital Media Forensic Detection
  • Advanced Steganography and Watermarking Techniques
  • Human Pose and Action Recognition
  • Image Processing Techniques and Applications
  • Remote-Sensing Image Classification
  • Chaos-based Image/Signal Encryption
  • Face recognition and analysis
  • Advanced Image Processing Techniques
  • Speech Recognition and Synthesis
  • Advanced Neural Network Applications
  • Text and Document Classification Technologies
  • Advanced Adaptive Filtering Techniques
  • Image and Video Stabilization
  • Digital Holography and Microscopy
  • Remote Sensing and Land Use
  • Advanced Technologies in Various Fields
  • Parallel Computing and Optimization Techniques
  • Advanced Image Fusion Techniques
  • Seismology and Earthquake Studies
  • Image Enhancement Techniques
  • Cell Image Analysis Techniques

Anhui Agricultural University
2023-2025

Ocean University of China
2025

National University of Defense Technology
2022-2023

Hunan University
2023

Xidian University
2021

Inner Mongolia University of Technology
2021

Central South University
2018-2020

Central South University of Forestry and Technology
2018-2020

Xi'an Jiaotong University
2015

Shanghai Jiao Tong University
2010

Encrypted image retrieval in cloud computing is a key technology for the secure storage and management of massive image collections. In this paper, a novel feature extraction method is proposed. First, an improved Harris algorithm is used to extract features. Next, Speeded-Up Robust Features (SURF) and the Bag of Words model are applied to generate vectors for each image. Then, Locality-Sensitive Hashing (LSH) is used to construct a searchable index over the vectors. A chaotic encryption scheme is utilized to protect the security of the indexes. Finally, secure similarity...

10.1109/access.2019.2894673 article EN cc-by-nc-nd IEEE Access 2019-01-01
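
The indexing step in the abstract above (hashing per-image vectors into a searchable index) can be illustrated with a minimal sketch. It assumes random-hyperplane LSH, which is one common LSH family and not necessarily the one used in the paper; the function names, the 8-bit signature length, and the toy vectors are all hypothetical, and the chaotic encryption of the index is omitted.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normals) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(vectors, planes):
    """Map each signature (bucket key) to the image ids that hash to it."""
    index = defaultdict(list)
    for image_id, vec in vectors.items():
        index[signature(vec, planes)].append(image_id)
    return index


def query(index, planes, vec):
    """Return candidate image ids sharing the query vector's bucket."""
    return index.get(signature(vec, planes), [])


# Toy feature vectors standing in for per-image Bag of Words histograms.
vectors = {
    "img_a": [1.0, 0.2, 0.0, 0.5],
    "img_b": [0.0, 1.0, 0.9, 0.1],
}
planes = make_hyperplanes(n_bits=8, dim=4)
index = build_index(vectors, planes)
candidates = query(index, planes, [2.0, 0.4, 0.0, 1.0])  # img_a scaled by 2
```

Because the signature depends only on the sign of each projection, a positively scaled copy of a vector lands in the same bucket, so the query here returns `img_a` as a candidate. A practical index would use several hash tables to raise recall.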

The rapid development of big data and cloud computing technologies has greatly accelerated the spread and utilization of images and videos. Copyright protection for images and videos is becoming an increasingly serious concern. In this paper, we propose robust non-blind watermarking schemes in the YCbCr color space based on channel coding. The source watermark image is encoded and then decomposed by singular value decomposition. Subsequently, the matrices are embedded into the Y, Cb, and Cr components of the host image after a four-level discrete wavelet transform (DWT). The embedding factor...

10.1109/access.2019.2896304 article EN cc-by-nc-nd IEEE Access 2019-01-01

Recently, character-word lattice structures have achieved promising results for Chinese named entity recognition (NER), reducing word segmentation errors and increasing boundary information for character sequences. However, constructing the structure is complex and time-consuming, so these lattice-based models usually suffer from low inference speed. Moreover, the quality of the lexicon affects the accuracy of the NER model. Since noise words can potentially confuse NER, and limited coverage can cause the model to degenerate into...

10.1109/tnnls.2025.3528416 article EN IEEE Transactions on Neural Networks and Learning Systems 2025-01-01

Education plays an increasingly important role in disseminating knowledge because of the explosive growth of knowledge. As one kind of carrier for delivering knowledge, images also show a growing trend in education, medicine, advertising, entertainment, and so on. In the construction of smart campuses, feature extraction over massive image collections takes a long time, and the traditional Harris corner detector has problems such as low detection efficiency and many non-maximal pseudo-corner points. This paper proposes a matching method that combines...

10.1109/access.2018.2878147 article EN cc-by-nc-nd IEEE Access 2018-01-01

Cross-modal retrieval aims to enable a flexible bi-directional retrieval experience across different modalities (e.g., searching for videos with texts). Many existing efforts tend to learn a common semantic representation embedding space in which items of different modalities can be directly compared, wherein the global representations of positive video-text pairs are pulled close while those of negative pairs are pushed apart via a pair-wise ranking loss. However, such a vanilla loss would unfortunately yield ambiguous feature embeddings for texts...

10.1109/tcsvt.2023.3257193 article EN IEEE Transactions on Circuits and Systems for Video Technology 2023-03-14
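
The pair-wise ranking loss mentioned in the abstract above can be sketched as follows. This is one common hinge-based, bi-directional form that sums over all in-batch negatives; the paper's exact formulation is not given in the abstract, so the cosine similarity, the margin of 0.2, and the function names are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def pairwise_ranking_loss(video_embs, text_embs, margin=0.2):
    """Hinge ranking loss; video_embs[i] and text_embs[i] form a positive pair."""
    n = len(video_embs)
    sim = [[cosine(v, t) for t in text_embs] for v in video_embs]
    loss = 0.0
    for i in range(n):
        pos = sim[i][i]
        for j in range(n):
            if j == i:
                continue
            # Video i as query: negative text j should score below the positive.
            loss += max(0.0, margin + sim[i][j] - pos)
            # Text i as query: negative video j should score below the positive.
            loss += max(0.0, margin + sim[j][i] - pos)
    return loss


aligned = pairwise_ranking_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = pairwise_ranking_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

When every positive pair already beats every negative by at least the margin, as in the `aligned` case, the loss is zero; mismatched pairs, as in `swapped`, are penalized in both retrieval directions.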

Text-based person re-identification (ReID) has enabled canonical applications in searching for and tracking targets from large-scale surveillance images with textual descriptions. Yet, existing text-based ReID systems employ centralized model training that gathers images captured by different institutes' cameras into one place, which poses severe privacy threats to sensitive institutional information. This work is therefore devoted to exploring privacy-preserving text-based ReID and proposes the framework of FedSH tailoring...

10.1109/tmm.2023.3330091 article EN IEEE Transactions on Multimedia 2023-11-06

Recent image manipulation detection approaches primarily rely on sophisticated Convolutional Neural Network (CNN)-based models for region localization, while they tend to ignore 1) the feature correlations that exist between manipulated and non-manipulated regions and 2) the significance of multi-scale representations in detecting regions of varying sizes, consequently hampering the overall performance of detection. To address these limitations, we propose a novel approach, called Cascade Hierarchical Graph...

10.1109/tcsvt.2024.3390127 article EN IEEE Transactions on Circuits and Systems for Video Technology 2024-04-17

Multimodal named entity recognition (MNER) is an emerging field that aims to automatically detect entities and classify their categories, utilizing input text and auxiliary resources such as images. While previous studies have leveraged object detectors to preprocess images and fuse textual semantics with corresponding image features, these methods often overlook the potential finer-grained information within each modality and may exacerbate error propagation due to predetection. To address these issues, we propose...

10.1109/tnnls.2025.3528567 article EN IEEE Transactions on Neural Networks and Learning Systems 2025-01-01

10.1109/icassp49660.2025.10890593 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Recently, hierarchical fine-grained fusion mechanisms have proved effective in cross-modal retrieval between videos and texts. Generally, the semantic representations to be fused (video-text matching is decomposed into three levels: global-event representation matching, action-relation representation matching, and local-entity representation matching) can work well by themselves for the query. However, in real-world scenarios and applications, existing methods fail to adaptively estimate the effectiveness of the multiple representations for a given query in advance...

10.1109/tnnls.2022.3214208 article EN IEEE Transactions on Neural Networks and Learning Systems 2022-11-03

Image–text cross-modal retrieval aims to bridge the semantic gap between different modalities, allowing for the search of images based on textual descriptions or vice versa. Existing efforts in this field concentrate on coarse-grained feature representation and then utilize a pairwise ranking loss to pull positive image–text pairs closer while pushing negative ones apart. However, using this approach directly lacks reliability, as it disregards fine-grained information, posing a challenge in narrowing the gap between image and text. To this end, we...

10.3390/electronics13020300 article EN Electronics 2024-01-09

Recently, searchable encrypted image retrieval in a cloud environment has been widely studied. However, inappropriate encryption mechanisms and single feature descriptions make it hard to achieve the expected effects. Therefore, a major challenge of encrypted image retrieval is how to extract and fuse multiple efficient features to improve performance. Towards this end, this paper proposes a searchable encrypted image retrieval method based on multi-feature adaptive late fusion in a cloud environment. Firstly, image encryption is completed by designing an encryption function over an RGB color channel, bit plane, and pixel position...

10.3390/math8061019 article EN cc-by Mathematics 2020-06-22

Change detection for remote sensing images is an indispensable procedure for many applications, such as geological disaster assessment, environmental monitoring, and urban development monitoring. Through this technique, the difference in certain areas after some emergencies can be determined to estimate their influence. Additionally, by analyzing sequential maps, the change tendency can be found to help predict future changes, such as pollution. The complex variety of changes and the interference caused by imaging processing,...

10.3390/rs13224597 article EN cc-by Remote Sensing 2021-11-16

Leveraging trace-rich features within embedded spaces has been established as effective in image manipulation localization (IML). Nevertheless, the feature of manipulated traces frequently comprises substantial redundant information only loosely related to IML tasks. This complexity has hindered existing methods from fully comprehending the essence of trace features. In light of this challenge, we introduce a novel decoupling representation learning network (DRN) tailored for IML. The DRN excels at intricate...

10.1109/tnnls.2024.3472846 article EN IEEE Transactions on Neural Networks and Learning Systems 2024-01-01

With the support of deep neural networks, existing image manipulation detection (IMD) methods can detect manipulated regions within a suspicious image effectively. In general, manipulation operations (e.g., splicing, copy-move, and removal) tend to leave artifacts in the high-frequency domain of the image, which provides rich clues for locating manipulated regions. Inspired by this phenomenon, in this paper, we propose a High-Frequency Component Enhancement Network, HFCE-Net for short, for image manipulation detection, which aims to fully explore the artifacts left to improve localization...

10.3390/electronics13020447 article EN Electronics 2024-01-21
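
The observation in the abstract above, that manipulation tends to leave traces in the high-frequency part of an image, can be illustrated with a simple high-pass residual. This sketch is not HFCE-Net and makes no claim about how that network extracts high-frequency components; it assumes a grayscale image stored as a list of rows and uses a 3x3 box blur with replicated borders as a stand-in low-pass filter. The function name is hypothetical.

```python
def high_frequency_residual(image):
    """Return image minus its 3x3 box blur (borders replicated)."""
    height, width = len(image), len(image[0])
    residual = []
    for r in range(height):
        row = []
        for c in range(width):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    # Clamp neighbour coordinates so edge pixels reuse themselves.
                    rr = min(max(r + dr, 0), height - 1)
                    cc = min(max(c + dc, 0), width - 1)
                    total += image[rr][cc]
            # Low-pass value is the neighbourhood mean; the residual keeps
            # only what the blur removed (edges, noise, fine texture).
            row.append(image[r][c] - total / 9.0)
        residual.append(row)
    return residual


flat = [[5.0] * 4 for _ in range(4)]
spike = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
flat_res = high_frequency_residual(flat)
spike_res = high_frequency_residual(spike)
```

A uniform region produces a zero residual, while an isolated abrupt change such as the centre pixel of `spike` produces a strong response, which is the kind of cue a localization model could build on.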