NFDI4DS | UHH-SEMS - Publication Details

Nenghai Yu

ORCID: 0000-0003-4417-9316

About

Contact & Profiles

OpenAlex ID: A5064573190

Research Areas

Advanced Steganography and Watermarking Techniques
Digital Media Forensic Detection
Chaos-based Image/Signal Encryption
Advanced Image and Video Retrieval Techniques
Adversarial Robustness in Machine Learning
Generative Adversarial Networks and Image Synthesis
Video Surveillance and Tracking Methods
Face recognition and analysis
Advanced Neural Network Applications
Anomaly Detection Techniques and Applications
Image Retrieval and Classification Techniques
Domain Adaptation and Few-Shot Learning
Cryptography and Data Security
Internet Traffic Analysis and Secure E-voting
Multimodal Machine Learning Applications
Human Pose and Action Recognition
Quantum Information and Cryptography
Video Analysis and Summarization
Privacy-Preserving Technologies in Data
Advanced Image Processing Techniques
Quantum Computing Algorithms and Architecture
Advanced Vision and Imaging
Advanced Malware Detection Techniques
Image Enhancement Techniques
Face and Expression Recognition

University of Science and Technology of China
2016-2025

Chinese Academy of Sciences
2015-2024

Hefei University
2023-2024

Hefei Institutes of Physical Science
2016-2023

Fordham University
2021-2023

National Engineering Research Center of Electromagnetic Radiation Control Materials
2023

China Southern Power Grid (China)
2023

University of Alabama in Huntsville
2021

City University of Hong Kong
2020

King University
2016

CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows

OPENALEX - Publications

Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, and 3 more

We present CSWin Transformer, an efficient and effective Transformer-based backbone for general-purpose vision tasks. A challenging issue in Transformer design is that global self-attention is very expensive to compute, whereas local self-attention often limits the field of interactions of each token. To address this issue, we develop the Cross-Shaped Window self-attention mechanism for computing self-attention in the horizontal and vertical stripes in parallel that form a cross-shaped window, with each stripe obtained by splitting the input feature into stripes of equal width. We provide...

10.1109/cvpr52688.2022.01181 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Reversible Data Hiding in Encrypted Images by Reserving Room Before Encryption

Kede Ma, Weiming Zhang, Xianfeng Zhao, Nenghai Yu, Fenghua Li

Recently, more and more attention is paid to reversible data hiding (RDH) in encrypted images, since it maintains the excellent property that the original cover can be losslessly recovered after the embedded data is extracted while protecting the image content's confidentiality. All previous methods embed data by reversibly vacating room from the encrypted images, which may be subject to some errors on data extraction and/or image restoration. In this paper, we propose a novel method by reserving room before encryption with a traditional RDH algorithm, and thus it is easy for...

10.1109/tifs.2013.2248725 article EN IEEE Transactions on Information Forensics and Security 2013-02-25

Dual Learning for Machine Translation

Di He, Yingce Xia, Tao Qin, Liwei Wang, Nenghai Yu, and 2 more

While neural machine translation (NMT) is making good progress in the past two years, tens of millions of bilingual sentence pairs are needed for its training. However, human labeling is very costly. To tackle this training data bottleneck, we develop a dual-learning mechanism, which can enable an NMT system to automatically learn from unlabeled data through a dual-learning game. This mechanism is inspired by the following observation: any machine translation task has a dual task, e.g., English-to-French translation (primal) versus French-to-English translation (dual);...

10.48550/arxiv.1611.00179 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Multi-attentional Deepfake Detection

Hanqing Zhao, Tianyi Wei, Wenbo Zhou, Weiming Zhang, Dongdong Chen, and 1 more

Face forgery by deepfake is widely spread over the internet and has raised severe societal concerns. Recently, how to detect such forgery contents has become a hot research topic and many deepfake detection methods have been proposed. Most of them model deepfake detection as a vanilla binary classification problem, i.e., first use a backbone network to extract a global feature and then feed it into a binary classifier (real/fake). But since the difference between the real and fake images in this task is often subtle and local, we argue this vanilla solution is not optimal. In this paper, we instead...

10.1109/cvpr46437.2021.00222 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

StyleBank: An Explicit Representation for Neural Image Style Transfer

Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, Gang Hua

We propose StyleBank, which is composed of multiple convolution filter banks and each filter bank explicitly represents one style, for neural image style transfer. To transfer an image to a specific style, the corresponding filter bank is operated on top of the intermediate feature embedding produced by a single auto-encoder. The StyleBank and the auto-encoder are jointly learnt, where the learning is conducted in such a way that the auto-encoder does not encode any style information thanks to the flexibility introduced by the explicit filter bank representation. It also enables us to conduct...

10.1109/cvpr.2017.296 preprint EN 2017-07-01

Reversibility improved data hiding in encrypted images

Weiming Zhang, Kede Ma, Nenghai Yu

10.1016/j.sigpro.2013.06.023 article EN Signal Processing 2013-06-28

Online Multi-object Tracking Using CNN-Based Single Object Tracker with Spatial-Temporal Attention Mechanism

Qi Chu, Wanli Ouyang, Hongsheng Li, Xiaogang Wang, Bin Liu, and 1 more

In this paper, we propose a CNN-based framework for online MOT. This framework utilizes the merits of single object trackers in adapting appearance models and searching for the target in the next frame. Simply applying a single object tracker for MOT will encounter problems of computational efficiency and drifted results caused by occlusion. Our framework achieves computational efficiency by sharing features and using ROI-Pooling to obtain individual features for each target. Some online learned target-specific CNN layers are used for adapting the appearance model for each target. In the framework, we introduce a spatial-temporal attention mechanism (STAM)...

10.1109/iccv.2017.518 article EN 2017-10-01

Learning Spatial Regularization with Image-Level Supervisions for Multi-label Image Classification

Feng Zhu, Hongsheng Li, Wanli Ouyang, Nenghai Yu, Xiaogang Wang

Multi-label image classification is a fundamental but challenging task in computer vision. Great progress has been achieved by exploiting semantic relations between labels in recent years. However, conventional approaches are unable to model the underlying spatial relations between labels in multi-label images, because spatial annotations of the labels are generally not provided. In this paper, we propose a unified deep neural network that exploits both semantic and spatial relations between labels with only image-level supervisions. Given a multi-label image, our proposed Spatial Regularization...

10.1109/cvpr.2017.219 article EN 2017-07-01

Healthchain: A Blockchain-Based Privacy Preserving Scheme for Large-Scale Health Data

Jie Xu, Kaiping Xue, Shaohua Li, Hangyu Tian, Jianan Hong, and 2 more

With the dramatically increasing deployment of the Internet of Things (IoT), remote monitoring of health data to achieve intelligent healthcare has received great attention recently. However, due to the limited computing power and storage capacity of IoT devices, users' health data are generally stored in a centralized third party, such as the hospital database or cloud, which makes users lose control of their health data and can easily result in privacy leakage and a single-point bottleneck. In this paper, we propose Healthchain, a large-scale...

10.1109/jiot.2019.2923525 article EN IEEE Internet of Things Journal 2019-06-18

Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain

Honggu Liu, Xiaodan Li, Wenbo Zhou, Yuefeng Chen, Yuan He, and 3 more

The remarkable success in face forgery techniques has received considerable attention in computer vision due to security concerns. We observe that up-sampling is a necessary step of most face forgery techniques, and cumulative up-sampling will result in obvious changes in the frequency domain, especially in the phase spectrum. According to the property of natural images, the phase spectrum preserves abundant frequency components that provide extra information and complement the loss of the amplitude spectrum. To this end, we present a novel Spatial-Phase Shallow Learning (SPSL) method, which...

10.1109/cvpr46437.2021.00083 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer

Yan Lu, Yue Wu, Bin Liu, Tianzhu Zhang, Baopu Li, and 2 more

Cross-modality person re-identification (cm-ReID) is a challenging but key technology for intelligent video analysis. Existing works mainly focus on learning a modality-shared representation by embedding different modalities into the same feature space, lowering the upper bound of feature distinctiveness. In this paper, we tackle the above limitation by proposing a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modality-specific characteristics...

10.1109/cvpr42600.2020.01339 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

A Privacy-Preserving Remote Data Integrity Checking Protocol with Data Dynamics and Public Verifiability

Zhuo Hao, Sheng Zhong, Nenghai Yu

Remote data integrity checking is a crucial technology in cloud computing. Recently, many works focus on providing data dynamics and/or public verifiability to this type of protocols. Existing protocols can support both features with the help of a third-party auditor. In a previous work, Sebé et al. propose a remote data integrity checking protocol that supports data dynamics. In this paper, we adapt Sebé et al.'s protocol to support public verifiability. The proposed protocol supports public verifiability without the help of a third-party auditor. In addition, the proposed protocol does not leak any private information to third-party verifiers. Through a formal analysis, we show the correctness...

10.1109/tkde.2011.62 article EN IEEE Transactions on Knowledge and Data Engineering 2011-03-15

Coherent Online Video Style Transfer

Dongdong Chen, Jing Liao, Lu Yuan, Nenghai Yu, Gang Hua

Training a feed-forward network for the fast neural style transfer of images has proven successful, but the naive extension of processing videos frame by frame is prone to producing flickering results. We propose the first end-to-end network for online video style transfer, which generates temporally coherent stylized video sequences in near realtime. Two key ideas include an efficient network by incorporating short-term coherence, and propagating short-term coherence to long-term, which ensures consistency over a longer period of time. Our network can incorporate different...

10.1109/iccv.2017.126 article EN 2017-10-01

Recursive Histogram Modification: Establishing Equivalency Between Reversible Data Hiding and Lossless Data Compression

Weiming Zhang, Xiaocheng Hu, Xiaolong Li, Nenghai Yu

State-of-the-art schemes for reversible data hiding (RDH) usually consist of two steps: first construct a host sequence with a sharp histogram via prediction errors, and then embed messages by modifying the histogram with methods such as difference expansion and histogram shift. In this paper, we focus on the second stage, and propose a histogram modification method for RDH, which embeds the message by recursively utilizing the decompression and compression processes of an entropy coder. We prove that, for independent and identically distributed (i.i.d.) gray-scale...

10.1109/tip.2013.2257814 article EN IEEE Transactions on Image Processing 2013-04-12

Semantics Disentangling for Text-To-Image Generation

Guojun Yin, Bin Liu, Lu Sheng, Nenghai Yu, Xiaogang Wang, and 1 more

Synthesizing photo-realistic images from text descriptions is a challenging problem. Previous studies have shown remarkable progress on the visual quality of the generated images. In this paper, we consider semantics from the input text descriptions in helping render photo-realistic images. However, diverse linguistic expressions pose challenges in extracting consistent semantics even when they depict the same thing. To this end, we propose a novel photo-realistic text-to-image generation model that implicitly disentangles semantics to both fulfill the high-level semantic consistency and low-level...

10.1109/cvpr.2019.00243 preprint EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Anonymous authentication scheme for smart home environment with provable security

Mengxia Shuai, Nenghai Yu, Hongxia Wang, Ling Xiong

10.1016/j.cose.2019.06.002 article EN Computers & Security 2019-06-12

Protecting Celebrities from DeepFake with Identity Consistency Transformer

Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Ting Zhang, Weiming Zhang, and 4 more

In this work we propose Identity Consistency Transformer, a novel face forgery detection method that focuses on high-level semantics, specifically identity information, and detects a suspect face by finding identity inconsistency in inner and outer face regions. The Identity Consistency Transformer incorporates a consistency loss for identity consistency determination. We show that Identity Consistency Transformer exhibits superior generalization ability not only across different datasets but also across various types of image degradation forms found in real-world applications including deepfake videos....

10.1109/cvpr52688.2022.00925 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Online multi-object tracking with unsupervised re-identification learning and occlusion estimation

Qiankun Liu, Dongdong Chen, Qi Chu, Lu Yuan, Bin Liu, and 2 more

10.1016/j.neucom.2022.01.008 article EN Neurocomputing 2022-01-06

PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers

Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, and 5 more

This paper explores a better prediction target for BERT pre-training of vision transformers. We observe that current prediction targets disagree with human perception judgment. This contradiction motivates us to learn a perceptual prediction target. We argue that perceptually similar images should stay close to each other in the prediction target space. We surprisingly find one simple yet effective idea: enforcing perceptual similarity during the dVAE training. Moreover, we adopt a self-supervised transformer model for deep feature extraction and show that it works well...

10.1609/aaai.v37i1.25130 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining

Xiaoyi Dong, Jianmin Bao, Yinglin Zheng, Ting Zhang, Dongdong Chen, and 7 more

This paper presents a simple yet effective framework MaskCLIP, which incorporates a newly proposed masked self-distillation into contrastive language-image pretraining. The core idea of masked self-distillation is to distill representation from a full image to the representation predicted from a masked image. Such incorporation enjoys two vital benefits. First, masked self-distillation targets local patch representation learning, which is complementary to vision-language contrastive focusing on text-related representation. Second, masked self-distillation is also consistent with vision-language contrastive from the perspective of training objective as both utilize the visual encoder...

10.1109/cvpr52729.2023.01058 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

F2Trans: High-Frequency Fine-Grained Transformer for Face Forgery Detection

Changtao Miao, Zichang Tan, Qi Chu, Huan Liu, Honggang Hu, and 1 more

In recent years, face forgery detectors have aroused great interest and achieved impressive performance, but they are still struggling with generalization and robustness. In this work, we explore taking full advantage of the fine-grained forgery traces in both spatial and frequency domains to alleviate this issue. Specifically, we propose a novel High-Frequency Fine-Grained Transformer (F2Trans) network which contains two important components, namely Central Difference Attention (CDA) and High-frequency Wavelet Sampler...

10.1109/tifs.2022.3233774 article EN IEEE Transactions on Information Forensics and Security 2023-01-01

MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection

Tianxiang Chen, Zi Ye, Zhentao Tan, Tao Gong, Yue Wu, and 4 more

10.1109/tgrs.2024.3485721 article EN IEEE Transactions on Geoscience and Remote Sensing 2024-01-01

MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators

Yaqi Zhang, Di Huang, Bin Liu, Shixiang Tang, Yan Lu, and 5 more

Generating realistic human motion from given action descriptions has experienced significant advancements because of the emerging requirement of digital humans. While recent works have achieved impressive results in generating motion directly from textual action descriptions, they often support only a single modality of the control signal, which limits their application in the real digital human industry. This paper presents a Motion General-Purpose generaTor (MotionGPT) that can use multimodal control signals, e.g., text and single-frame poses, for...

10.1609/aaai.v38i7.28567 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24