Yuesheng Zhu

ORCID: 0000-0003-2524-6800
Research Areas
  • Video Surveillance and Tracking Methods
  • Advanced Image and Video Retrieval Techniques
  • Advanced Vision and Imaging
  • Advanced Steganography and Watermarking Techniques
  • Digital Media Forensic Detection
  • Advanced Image Processing Techniques
  • Face and Expression Recognition
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Generative Adversarial Networks and Image Synthesis
  • Face recognition and analysis
  • Image Retrieval and Classification Techniques
  • Domain Adaptation and Few-Shot Learning
  • Anomaly Detection Techniques and Applications
  • Video Coding and Compression Technologies
  • Advanced Neural Network Applications
  • Chaos-based Image/Signal Encryption
  • Video Analysis and Summarization
  • Advanced MIMO Systems Optimization
  • Advanced Wireless Communication Techniques
  • Biometric Identification and Security
  • Speech and Audio Processing
  • Sparse and Compressive Sensing Techniques
  • Cooperative Communication and Network Coding
  • Handwritten Text Recognition Techniques

Peking University
2016-2025

Peking University Shenzhen Hospital
2010-2025

Second Affiliated Hospital & Yuying Children's Hospital of Wenzhou Medical University
2021

Wenzhou Medical University
2021

Beijing Institute of Big Data Research
2017-2021

University of Chile
2019

Peking University Third Hospital
2016

Jilin Business and Technology College
2011

City University of Hong Kong
2010-2011

Jilin Province Science and Technology Department
2011

As a vital copyright protection technology, blind watermarking based on deep learning with an end-to-end encoder-decoder architecture has been recently proposed. Although the one-stage end-to-end training (OET) facilitates the joint learning of the encoder and decoder, the noise attack must be simulated in a differentiable way, which is not always applicable in practice. In addition, OET often encounters the problem of converging slowly and tends to degrade the quality of watermarked images under noise attack. In order to address the above problems and improve...

10.1145/3343031.3351025 article EN Proceedings of the 27th ACM International Conference on Multimedia 2019-10-15

Joint iris-periocular recognition based on feature fusion can overcome some inherent drawbacks of unimodal biometrics, but most of the prior works are limited by conventional feature extraction approaches and fixed fusion schemes. To achieve more accurate and adaptive iris-periocular recognition, an end-to-end deep feature fusion network for joint iris-periocular recognition is proposed in this paper. Multiple attention mechanisms, including self-attention and co-attention mechanisms, are integrated into the network. Specifically, two forms of attention mechanisms, spatial and channel attention, are inserted into the feature extraction module,...

10.1109/lsp.2021.3079850 article EN IEEE Signal Processing Letters 2021-01-01

Spiking Neural Networks (SNNs), known for their biologically plausible architecture, face the challenge of limited performance. The self-attention mechanism, which is the cornerstone of the high-performance Transformer and also a biologically inspired structure, is absent in existing SNNs. To this end, we explore the potential of leveraging both the self-attention capability and the biological properties of SNNs, and propose a novel Spiking Self-Attention (SSA) and Spiking Transformer (Spikformer). The SSA mechanism eliminates the need for softmax and captures the sparse visual feature employing spike-based...

10.48550/arxiv.2401.02020 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Depth-image-based rendering (DIBR) plays a key role in 3D video synthesis, by which other virtual views can be generated from a 2D video and its depth map. However, in the synthesis process, the background occluded by the foreground objects might be exposed in the new view, resulting in some holes in the synthesized video. In this paper, a hole-filling approach based on background reconstruction is proposed, in which the temporal correlation information in both the 2D video and its corresponding depth map is exploited to construct a background video. To construct a clean background video, the foreground objects are detected and removed. Also motion compensation...

10.1109/cvpr.2016.197 article EN 2016-06-01

This paper proposes a disocclusion inpainting framework for depth-based view synthesis. It consists of four modules: foreground extraction, motion compensation, improved background reconstruction, and inpainting. The foreground extraction module detects the foreground objects and removes them from both the depth map and the rendered video; the motion compensation module ensures that the background reconstruction model suits moving-camera scenarios; the improved background reconstruction module constructs a stable background video by exploiting the temporal correlation information in both the 2D video and its corresponding depth map; the constructed...

10.1109/tpami.2019.2899837 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-02-15

The increase in the source and size of encrypted network traffic brings significant challenges for network traffic analysis. The challenging problem in the encrypted traffic classification field is obtaining high classification accuracy with a small number of labeled samples. To solve this problem, we propose in this paper a novel encrypted traffic classification method that learns feature representation from the structure of flow data. We construct a K-Nearest Neighbor (KNN) graph to represent the structure of the data, which contains more similarity information about traffic. We utilize a two-layer Graph...

10.1109/ipccc50635.2020.9391542 article EN 2020-11-06
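
As a rough illustration of the K-Nearest Neighbor (KNN) graph construction mentioned in the abstract above, the following minimal sketch builds a symmetric KNN adjacency matrix from per-flow feature vectors. It is not the authors' implementation: the function name knn_graph, the use of Euclidean distance, the symmetrization step, and the toy feature values are all assumptions made for illustration only.

```python
import numpy as np

def knn_graph(features, k):
    """Build a symmetric KNN adjacency matrix (hypothetical helper).

    features: (n, d) array, one feature vector per traffic flow (assumed).
    k: number of nearest neighbors linked to each node.
    """
    x = np.asarray(features, dtype=float)
    n = x.shape[0]
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2ab.
    sq = np.sum(x ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    # A node must not pick itself as a neighbor.
    np.fill_diagonal(dist, np.inf)
    # Indices of the k closest nodes for every row.
    nbrs = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=int)
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1
    # Symmetrize: keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)

if __name__ == "__main__":
    # Toy flow features (made-up values): two well-separated pairs.
    flows = [[0, 0], [0, 1], [10, 10], [10, 11]]
    print(knn_graph(flows, k=1))
```

An adjacency matrix of this kind is the usual input to a graph neural network layer; how the paper weights edges or chooses k is not stated in the truncated abstract.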

Detecting polyps through colonoscopy is an important task in medical image segmentation, which provides significant assistance and reference value for clinical surgery. However, accurate segmentation of polyps is a challenging task due to two main reasons. Firstly, polyps exhibit various shapes and colors. Secondly, the boundaries between polyps and their normal surroundings are often unclear. Additionally, significant differences between different datasets lead to limited generalization capabilities of existing methods. To address these issues, we...

10.48550/arxiv.2403.13660 preprint EN arXiv (Cornell University) 2024-03-20

Depth-Image-Based Rendering (DIBR) is widely used to generate a virtual view of a scene from a known view with an associated depth map in 3D video applications. However, disocclusion arises from image warping in DIBR. Many hole-filling methods have been proposed, such as constant color, horizontal interpolation, extrapolation, and variational inpainting, but they cause different types of annoying artifacts for large holes and complex texture backgrounds. In this paper, a novel multidirectional extrapolation method is proposed to enhance...

10.1109/icip.2011.6116194 article EN 2011-09-01

Mathematical expression recognition (MER) in images is a challenging task due to formula symbol and structure analysis. Optical character recognition (OCR) has been used in natural language recognition in many areas. However, it is difficult for OCR to recognize some special symbols accurately and confirm their positions in MER. In this paper, an improved end-to-end MER approach based on CNN-RNNs (convolutional neural network - recurrent neural networks) is proposed to optimize the processing and localization. In our approach, we extract mathematical...

10.1145/3330393.3330410 article EN 2019-05-10

With the rapid development of blockchain technology, different types of blockchains are adopted and interoperability across blockchains has received widespread attention. There have been many cross-chain solutions proposed in recent years, including notary scheme, sidechain, and relay chain. However, most of the existing cross-chain platforms do not take confidentiality into account, although privacy has become an important concern for blockchain. In this paper, we present TrustCross, a privacy-preserving cross-chain platform to enable...

10.1145/3510487.3510491 preprint EN 2021-12-17

Internal learning‐based video inpainting methods have shown promising results by exploiting the intrinsic properties of the video to fill in the missing region without external dataset supervision. However, existing internal learning‐based methods would produce inconsistent structures or blurry textures due to the insufficient utilisation of motion priors within the video sequence. In this paper, the authors propose a new video inpainting model called appearance consistency and motion coherence network (ACMC‐Net), which can not only learn the recurrence prior but also...

10.1049/cit2.12405 article EN cc-by CAAI Transactions on Intelligence Technology 2025-02-07

10.1109/icassp49660.2025.10890026 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Learning from multi-center medical datasets to obtain a high-performance global model is challenging due to the privacy protection and data heterogeneity in healthcare systems. Current federated learning approaches are not efficient enough to learn Non-Independent and Identically Distributed (Non-IID) data and require high communication costs. In this work, a practical computing framework is proposed to train a Non-IID medical image segmentation model under various Non-IID settings with low communication cost. Specifically, an efficient cascaded diffusion model is trained...

10.1109/jbhi.2025.3549029 article EN IEEE Journal of Biomedical and Health Informatics 2025-01-01

Recently, numerous robust image hashing schemes have been developed for content identification. However, many of these schemes face the challenges of maintaining discrimination while simultaneously resisting large-scale attacks. In this paper, we propose a robust image hashing scheme based on Contrastive Masked Autoencoder with weak-strong augmentation Alignment (CMAA). Leveraging contrastive learning, CMAA is designed to learn features that are robust to large-scale and hybrid attacks while maintaining the discrimination of those features. Specifically, it utilizes distribution...

10.1609/aaai.v39i9.32991 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Depth-image-based rendering is a key technique for 3D video and free viewpoint synthesis. One of the critical problems in current synthesis methods is that the background (BG) occluded by the foreground objects might be exposed in the new view, and some holes are produced in the synthesized video. However, most traditional hole-filling approaches may bring blurry effects or artifacts in the virtual view. In this paper, a foreground removal approach for hole filling is proposed, in which the foreground objects are removed from both the 2D video and its corresponding depth map, and then the BG...

10.1109/tcsvt.2016.2583978 article EN IEEE Transactions on Circuits and Systems for Video Technology 2016-06-23

Semantic searching over encrypted data is a crucial task for secure information retrieval in public cloud. It aims to provide retrieval service for arbitrary words so that queries and search results are flexible. In existing semantic searching schemes, verifiable searching is not supported, since it depends on results forecasted from predefined keywords to verify the search results from the cloud; in addition, the queries are expanded on plaintext and exact matching is performed between the semantically extended words and the predefined keywords, which limits their accuracy. In this paper, we propose a secure verifiable semantic searching scheme. For optimal...

10.1109/tifs.2020.3001728 article EN IEEE Transactions on Information Forensics and Security 2020-06-11

Video anomaly detection is a challenging task due to the diversity of anomalies. Existing GAN-based approaches model the normal motion pattern through transforming a single image to an optical flow map, which tends to learn the mapping between two adjacent frames instead of the motion evolution in normal scenes. Therefore, this paper proposes a Temporal enhanced Appearance-to-Motion generative Network (TAM-Net) to model the appearance and motion for normal events. In the motion branch, the corresponding optical flow map is generated by a ConvLSTM-based adversarial network from consecutive...

10.1109/ijcnn48605.2020.9207231 article EN 2020 International Joint Conference on Neural Networks (IJCNN) 2020-07-01

Embedding a watermark into H.264/AVC streams directly can reduce computational complexity compared to encoder-based watermarking algorithms. However, it would cause intra error propagation and decrease the video quality. To improve the video quality and reduce the computational complexity, an improved error compensation scheme is developed in this paper. In this scheme, only some of the integer transform (IT) coefficients are processed, as opposed to processing all IT coefficients as seen in other methods. This approach helps reduce the computation of the scheme. The...

10.1109/lsp.2011.2162061 article EN IEEE Signal Processing Letters 2011-07-21

Route planning is a key technology for an unmanned aerial vehicle (UAV) to fly reliably and safely in the presence of a threat environment. Existing route planning methods are mainly based on a simulation scene, whereas approaches based on a virtual globe platform have rarely been reported. In this paper, a new planning space for the route planner is proposed and a common threat model is constructed for threats including no-fly zone, hazardous weather, radar coverage area, missile killing zone, and dynamic threats. Additionally, an improved ant colony optimization (ACO)...

10.3390/ijgi5100184 article EN cc-by ISPRS International Journal of Geo-Information 2016-10-10

Video summarization is not only the key to effective cataloging and browsing of video, but also serves as an embedded cue to trace video object activities. In this paper, a video summarization approach based on machine learning is developed for automatic video transition prediction. Several novel features are extracted to characterize the video boundary, including cut, fade in, fade out, and dissolve, facilitating the understanding of the content structure and domain rules of video. These features can be used to filter negative false alarms caused by illumination changes and improve...

10.1109/iih-msp.2008.296 article EN 2008-08-01

We propose a robust digital watermarking algorithm for copyright protection. A stable feature is obtained by utilizing QR factorization and discrete cosine transform (DCT) techniques, and a meaningful watermark image is embedded into an image by modifying the stable feature with a quantization index modulation (QIM) method. The combination of QR factorization, DCT, and QIM techniques guarantees the robustness of the algorithm. Furthermore, an embedding location selection method is exploited to select blocks with small modifications as the embedding locations. This can...

10.1631/jzus.c1100338 article EN Journal of Zhejiang University SCIENCE C 2012-08-01
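
For readers unfamiliar with the quantization index modulation (QIM) step named in the abstract above, here is a minimal sketch of generic scalar QIM applied to a list of numbers. It is a textbook form of the technique, not the paper's algorithm: the QR factorization, DCT, and block selection stages are omitted, and the function names, the step size of 8.0, and the sample coefficient values are assumptions chosen for illustration.

```python
import numpy as np

def qim_embed(values, bits, step=8.0):
    """Embed one bit per value with scalar QIM (hypothetical helper).

    Bit 0 snaps the value to the lattice {m * step};
    bit 1 snaps it to the shifted lattice {m * step + step / 2}.
    """
    values = np.asarray(values, dtype=float)
    offset = np.asarray(bits, dtype=int) * step / 2.0
    return np.round((values - offset) / step) * step + offset

def qim_extract(values, step=8.0):
    """Recover bits by choosing the nearer of the two lattices."""
    values = np.asarray(values, dtype=float)
    half = step / 2.0
    d0 = np.abs(values - np.round(values / step) * step)
    d1 = np.abs(values - (np.round((values - half) / step) * step + half))
    return (d1 < d0).astype(int)

if __name__ == "__main__":
    # Made-up feature values standing in for the "stable feature".
    coeffs = [13.2, -4.7, 25.0, 0.3]
    bits = [1, 0, 1, 0]
    marked = qim_embed(coeffs, bits)
    # Perturbations smaller than step / 4 do not flip the decoded bits.
    print(qim_extract(marked + 1.5))
```

The step size trades robustness against distortion: each value moves by at most step / 2 during embedding, and decoding tolerates perturbations smaller than step / 4.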