Zhaoyi Liu

ORCID: 0000-0003-0697-9080
Research Areas
  • Music and Audio Processing
  • Speech and Audio Processing
  • Speech Recognition and Synthesis
  • Video Coding and Compression Technologies
  • Anomaly Detection Techniques and Applications
  • Advanced Data Compression Techniques
  • Water Systems and Optimization
  • Image and Signal Denoising Methods
  • Advanced Vision and Imaging
  • Advanced Image Processing Techniques
  • Image and Video Quality Assessment
  • Image Enhancement Techniques
  • Advanced Adaptive Filtering Techniques
  • Video Analysis and Summarization
  • Cloud Computing and Resource Management
  • Topic Modeling
  • Software-Defined Networks and 5G
  • IoT and Edge/Fog Computing
  • Machine Fault Diagnosis Techniques
  • Multimodal Machine Learning Applications
  • Software Engineering Research
  • Network Security and Intrusion Detection
  • Integrated Circuits and Semiconductor Failure Analysis
  • Human Motion and Animation
  • Computer Graphics and Visualization Techniques

KU Leuven
2022-2025

University of California, San Diego
2025

IMEC
2022-2024

Shanghai University of Engineering Science
2022

Wuhan University of Science and Technology
2021

Beijing Institute of Technology
2016-2019

Peking University Shenzhen Hospital
2019

Peking University
2018

University of Electronic Science and Technology of China
2016-2017

Microsoft (United States)
2015

Can we get network latency between any two servers at any time in large-scale data center networks? The collected latency data can then be used to address a series of challenges: telling if an application-perceived latency issue is caused by the network or not, defining and tracking network service level agreements (SLAs), and automatic network troubleshooting. We have developed the Pingmesh system for large-scale data center network latency measurement and analysis to answer the above question affirmatively. Pingmesh has been running in Microsoft data centers for more than four years, and it collects tens of terabytes of latency data per...

10.1145/2785956.2787496 article EN 2015-08-17

10.1145/2829988.2787496 article EN ACM SIGCOMM Computer Communication Review 2015-08-17

Tonic-clonic seizures (TCSs), which present a significant risk for sudden unexpected death in epilepsy (SUDEP), require accurate detection to enable effective long-term monitoring. Previous studies have demonstrated the advantages of multimodal seizure detection systems in reliably detecting TCSs over extended periods. However, the effectiveness of these data-driven systems depends heavily on the availability of reliable training data. To address this need, we propose an innovative data selection method designed...

10.1088/1741-2552/adbec0 article EN Journal of Neural Engineering 2025-03-10

This paper presents a novel convolutional neural network (CNN) based image compression framework via a scalable auto-encoder (SAE). Specifically, our SAE-based deep image codec consists of hierarchical coding layers, each of which is an end-to-end optimized auto-encoder. The coarse image content and texture are encoded through the first (base) layer, while the consecutive (enhance) layers iteratively code the pixel-level reconstruction errors between the original and the former reconstructed images. The proposed SAE structure alleviates the need...

10.1109/mipr.2019.00087 article EN 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) 2019-03-01
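
The layered scheme described in the abstract above (a base layer followed by enhancement layers that each code the error left by the previous reconstruction) can be illustrated with a toy sketch. This is not the paper's codec: a plain uniform quantizer stands in for each learned auto-encoder layer, and the names `quantize`, `scalable_encode`, and the step sizes are hypothetical, chosen only to show how residuals are passed from layer to layer.

```python
def quantize(values, step):
    """Toy stand-in for one coding layer (assumption: the paper uses a
    learned auto-encoder here). Rounds each value to a multiple of step."""
    return [step * round(v / step) for v in values]


def scalable_encode(signal, steps):
    """Code `signal` in layers.

    The first (base) layer codes the signal itself; every later
    (enhancement) layer codes the error between the original signal and
    the reconstruction accumulated from the earlier layers.
    """
    reconstruction = [0.0] * len(signal)
    layers = []
    for step in steps:
        # Error still left after the layers coded so far.
        residual = [s - r for s, r in zip(signal, reconstruction)]
        coded = quantize(residual, step)
        layers.append(coded)
        # A decoder that receives this layer adds it to its reconstruction.
        reconstruction = [r + c for r, c in zip(reconstruction, coded)]
    return layers, reconstruction


if __name__ == "__main__":
    pixels = [10.0, 23.0, 37.0]
    layers, recon = scalable_encode(pixels, steps=[16, 4, 1])
    print(layers)
    print(recon)
```

Decoding only the first entries of `layers` gives a coarser reconstruction, which is the sense in which such a layered code is scalable.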

10.1016/j.jvcir.2016.03.025 article EN Journal of Visual Communication and Image Representation 2016-03-31

We propose a novel chroma upsampling scheme for YCbCr 420 videos. In the state-of-the-art work, the upsampling is performed by simply duplicating the stored U and V values in 4 copies and saving them to the corresponding locations in YCbCr 444. In the proposed method, instead of directly copying the current values, we consider their neighboring values as weights to redistribute the total energy. The experimental results show that the proposed method is better than the state-of-the-art method in all tested videos in terms of CPSNR; the average gain is 0.13 dB.

10.1109/icce-china.2017.7991046 article EN 2017-06-01
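
For orientation, the duplication baseline mentioned in the chroma upsampling abstract above can be sketched as follows. This covers only the baseline (each stored chroma sample copied into a 2x2 block), not the proposed weighted redistribution, whose details the abstract does not give. The function names are hypothetical, and `cpsnr` assumes the common definition of CPSNR as a PSNR computed from the mean squared error pooled over all color planes; the paper may define it differently.

```python
import math


def upsample_chroma_by_duplication(plane):
    """Expand a 4:2:0 chroma plane to 4:4:4 size by copying every stored
    sample into a 2x2 block. `plane` is a list of rows of sample values."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]  # duplicate horizontally
        out.append(wide)
        out.append(list(wide))                     # duplicate vertically
    return out


def cpsnr(reference_planes, test_planes, peak=255.0):
    """PSNR in dB from the squared error pooled over all given planes
    (assumed CPSNR definition). Returns math.inf for identical inputs."""
    squared_error = 0.0
    count = 0
    for ref_plane, test_plane in zip(reference_planes, test_planes):
        for ref_row, test_row in zip(ref_plane, test_plane):
            for r, t in zip(ref_row, test_row):
                squared_error += (r - t) ** 2
                count += 1
    if squared_error == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / (squared_error / count))


if __name__ == "__main__":
    stored_u = [[10, 20], [30, 40]]
    for row in upsample_chroma_by_duplication(stored_u):
        print(row)
```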

Enhancing speech captured by distant microphones is a challenging task. In this study, we investigate the multichannel signal properties of a single acoustic vector sensor (AVS) to obtain the inter-sensor data ratio (ISDR) model in the time-frequency (TF) domain. Then, the monotone functions describing the relationship between the ISDRs and the direction of arrival (DOA) of the target speaker are derived. For the speech enhancement (SE) task, the DOA is given, and the ISDRs are calculated. Hence, the TF components dominated by the target speech are extracted with high probability using...

10.3390/app8091436 article EN cc-by Applied Sciences 2018-08-23

Sequential audio event tagging can provide not only the type information of audio events, but also the order information between events and the number of events that occur in an audio clip. Most previous works on audio event sequence analysis rely on connectionist temporal classification (CTC). However, CTC's conditional independence assumption prevents it from effectively learning correlations between diverse audio events. This paper first introduces the Transformer into sequential audio tagging, since Transformers perform well in sequence-related tasks. To better utilize...

10.21437/interspeech.2022-196 article EN Interspeech 2022 2022-09-16

H.264 coding technology is the most commonly used video compression technology. When videos are in transmission, packet losses inevitably occur. An error concealment method can solve this problem. For a lost block, we use the motion vectors of neighboring available blocks to estimate the lost motion vectors, and the estimates propagate to predict all other missing motion vectors. Compared with the state-of-the-art method, the proposed algorithm increases the PSNR by 1.93 dB on average.

10.1109/icasi.2017.7988151 article EN 2017-05-01
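
The propagation idea in the error concealment abstract above (estimate a lost motion vector from available neighbors, then let those estimates feed the remaining lost blocks) can be sketched as below. The abstract does not say how neighbors are combined, so this sketch assumes a 4-neighborhood and a component-wise average; the function name `conceal_motion_vectors` and the grid layout are likewise hypothetical and not taken from the paper or from any H.264 decoder API.

```python
def conceal_motion_vectors(mv_grid):
    """Fill lost motion vectors pass by pass.

    `mv_grid` is a list of rows; each cell is an (mvx, mvy) tuple or None
    for a lost block. In each pass, every lost block with at least one
    available 4-neighbor gets the average of those neighbors (assumed
    rule). Blocks filled in one pass count as available in the next.
    """
    rows, cols = len(mv_grid), len(mv_grid[0])
    grid = [list(row) for row in mv_grid]  # leave the caller's grid untouched
    while any(mv is None for row in grid for mv in row):
        updates = {}
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] is not None:
                    continue
                neighbors = [
                    grid[nr][nc]
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] is not None
                ]
                if neighbors:
                    n = len(neighbors)
                    updates[(r, c)] = (
                        sum(mv[0] for mv in neighbors) / n,
                        sum(mv[1] for mv in neighbors) / n,
                    )
        if not updates:
            break  # no available block anywhere, nothing to propagate from
        for (r, c), mv in updates.items():
            grid[r][c] = mv
    return grid


if __name__ == "__main__":
    frame = [[(2, 0), (4, 0), None, None]]
    print(conceal_motion_vectors(frame))
```

Collecting the updates before applying them keeps each pass independent of scan order, so a block is only ever estimated from vectors that were available at the start of that pass.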

This paper proposes a method for Acoustic Constrained Segmentation (ACS) in audio recordings of vehicles driven through a production test track, delimiting the boundaries of the surface types in the track. ACS is a variant of classical acoustic segmentation where the sequence of labels is known, contiguous and invariable, which is especially useful in this work as the test track has a standard configuration of surface types. The proposed ConvDTW-ACS method utilizes a Convolutional Neural Network for classifying overlapping image chunks extracted from the full...

10.48550/arxiv.2402.18204 preprint EN arXiv (Cornell University) 2024-02-28

As generative models achieve great success, tampering with and modifying sensitive image contents (e.g., human faces, artist signatures, commercial logos, etc.) has induced a significant threat with social impact. The backdoor attack is a method that implants vulnerabilities in a target model, which can be activated through a trigger. In this work, we innovatively prevent the abuse of image content modification by implanting the backdoor into image-editing models. Once the protected sensitive content on an image is modified by an editing model, the backdoor will be triggered,...

10.48550/arxiv.2410.14966 preprint EN arXiv (Cornell University) 2024-10-18

Many real-world user queries (e.g. "How to make egg fried rice?") could benefit from systems capable of generating responses with both textual steps and accompanying images, similar to a cookbook. Models designed to generate interleaved text and images face challenges in ensuring consistency within and across these modalities. To address these challenges, we present ISG, a comprehensive evaluation framework for interleaved text-and-image generation. ISG leverages a scene graph structure to capture relationships between text and image...

10.48550/arxiv.2411.17188 preprint EN arXiv (Cornell University) 2024-11-26

Pixel-wise regression tasks (e.g., monocular depth estimation (MDE) and optical flow estimation (OFE)) have been widely involved in our daily life in applications like autonomous driving, augmented reality and video composition. Although certain applications are security-critical or bear societal significance, the adversarial robustness of such models is not sufficiently studied, especially in the black-box scenario. In this work, we introduce the first unified patch attack framework against pixel-wise regression tasks, aiming to identify...

10.48550/arxiv.2404.00924 preprint EN arXiv (Cornell University) 2024-04-01

Anomaly detection models can help to automatically and proactively detect faults in industrial machines. Microphones are appealing as they are generally inexpensive and, unlike visual inspection, recording sound samples can give information about the internals of the machine. However, conventional methods based on an AutoEncoder (AE) structure learned from scratch struggle to learn how to robustly reconstruct with limited available data. This paper addresses this problem by presenting a method for unsupervised...

10.23919/apsipaasc55919.2022.9980266 article EN 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2022-11-07