Hao Zhang

ORCID: 0000-0003-0877-2681
Research Areas
  • Speech and Audio Processing
  • Advanced Adaptive Filtering Techniques
  • Music and Audio Processing
  • Advanced Malware Detection Techniques
  • Speech Recognition and Synthesis
  • Video Coding and Compression Technologies
  • Hearing Loss and Rehabilitation
  • Network Security and Intrusion Detection
  • Acoustic Wave Phenomena Research
  • Blind Source Separation Techniques
  • Advanced Data Compression Techniques
  • Time Series Analysis and Forecasting
  • Digital and Cyber Forensics
  • Advanced Vision and Imaging
  • Recommender Systems and Techniques
  • Anomaly Detection Techniques and Applications
  • Spam and Phishing Detection
  • Digital Filter Design and Implementation
  • Structural Health Monitoring Techniques
  • Data Management and Algorithms
  • Topic Modeling
  • Underwater Acoustics Research
  • Image and Video Quality Assessment
  • Advanced Image Processing Techniques
  • Network Packet Processing and Optimization

University of Science and Technology of China
2023-2025

Bellevue Hospital Center
2023-2024

Tencent (China)
2024

PLA Information Engineering University
2024

The Fourth People's Hospital
2024

China Medical University
2024

Chinese Academy of Sciences
2023

Aerospace Information Research Institute
2023

The Ohio State University
2022

Central South University
2013-2022

Satellite image time series (SITS) classification is a challenging application concurrently driven by long-term, large-scale, and high spatial-resolution observations acquired by remote sensing satellites. The focus of current SITS research is to exploit the richness of temporal information in the data. In the literature, self-attention mechanism-based networks, which are capable of capturing global attention, have achieved state-of-the-art results in SITS classification. However, these methods lack attention to local...

10.3390/rs15030618 article EN cc-by Remote Sensing 2023-01-20

The emerging High-Efficiency Video Coding (HEVC) standard has shown significant performance improvement compared to H.264/AVC, at the cost of a huge complexity increase. Hence, fast HEVC encoding algorithms are in high demand for real-time applications. In this paper, we propose several early termination schemes for intra prediction in HEVC. More specifically, the variation of mode costs is used to terminate the current coding unit (CU) decision as well as TU size selection, where CU derived at rough phase using...

10.1109/iscas.2013.6571778 article EN 2022 IEEE International Symposium on Circuits and Systems (ISCAS) 2013-05-01

Enhancing the expressive capacity of deep learning-based time series models with self-supervised pre-training has become increasingly prevalent in time series classification. Even though numerous efforts have been devoted to developing pre-training methods for time series data, we argue that current methods are not sufficient to learn optimal representations due to solely unidirectional encoding over sparse point-wise input units. In this work, we propose TimeMAE, a novel paradigm for learning transferrable representations based on transformer networks. The...

10.48550/arxiv.2303.00320 preprint EN other-oa arXiv (Cornell University) 2023-01-01

To address challenges in screening for chronic kidney disease (CKD), we devised a deep learning-based CKD screening model named UWF-CKDS. It utilizes ultra-wide-field (UWF) fundus images to predict the presence of CKD. We validated the model with data from 23 tertiary hospitals across China. Retinal vessels and retinal microvascular parameters (RMPs) were extracted to enhance interpretability, which revealed a significant correlation between renal function and RMPs. UWF-CKDS, utilizing UWF images, RMPs, relevant medical...

10.1038/s41746-024-01271-w article EN cc-by-nc-nd npj Digital Medicine 2024-10-07

We develop deep Poisson-gamma dynamical systems (DPGDS) to model sequentially observed multivariate count data, improving previously proposed models by not only mining hierarchical latent structure from the data, but also capturing both first-order and long-range temporal dependencies. Using sophisticated but simple-to-implement data augmentation techniques, we derived closed-form Gibbs sampling update equations by first backward and upward propagating auxiliary latent counts, and then forward and downward sampling latent variables....

10.48550/arxiv.1810.11209 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Recently, the Joint Video Exploration Team (JVET) has established the latest video coding standard, Versatile Video Coding (VVC). VVC adopts Multiple Transform Selection (MTS) as a supplement to the primary transform, where DST7 and DCT8 are included as optional transform kernels. Though MTS provides appreciable coding gain, its computational complexity is relatively high since several transforms need to be evaluated through the Rate Distortion Optimization (RDO) process. To address this issue, we propose a novel fast...

10.1109/icme.2019.00019 article EN 2022 IEEE International Conference on Multimedia and Expo (ICME) 2019-07-01

Large language model evaluation plays a pivotal role in the enhancement of model capacity. Previously, numerous methods for evaluating large language models have been proposed in this area. Despite their effectiveness, these existing works mainly focus on assessing objective questions, overlooking the capability to evaluate subjective questions, which are extremely common for large language models. Additionally, these methods predominantly utilize centralized datasets for evaluation, with question banks concentrated within the evaluation platforms themselves....

10.1145/3589335.3651243 preprint EN 2024-05-12

When automatic monitoring buoys receive mixed acoustic signals from multiple underwater targets, the statistical blind source separation (BSS) task is used to separate and identify vessel features, which is overly complex and needs improvement, especially noting that noise cancellation and stealth technologies are advancing rapidly. To fill this gap in capability, an improved non-negative matrix factorization (NMF) based BSS algorithm is built on a FastICA machine learning backbone. With this tool, spatial...

10.3389/fmars.2022.1097003 article EN cc-by Frontiers in Marine Science 2023-01-16

The robustness of the Kalman filter to double talk and its rapid convergence make it a popular approach for addressing acoustic echo cancellation (AEC) challenges. However, the inability to model nonlinearity and the need to tune control parameters cast limitations on such adaptive filtering algorithms. In this paper, we integrate the frequency domain Kalman filter (FDKF) and deep neural networks (DNNs) into a hybrid method, called NeuralKalman, to leverage the advantages of deep learning and adaptive filtering. Specifically, we employ a DNN to estimate nonlinearly distorted...

10.1109/asru57964.2023.10389780 article EN 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2023-12-16

In this paper, we introduce a novel training framework designed to comprehensively address the acoustic howling issue by examining its fundamental formation process. This framework integrates a neural network (NN) module into the closed-loop system during training, with signals generated recursively on the fly to closely mimic the streaming process of acoustic howling suppression (AHS). The proposed recursive training strategy bridges the gap between training and real-world inference scenarios, marking a departure from previous NN-based methods that typically approach...

10.1109/icassp48485.2024.10447839 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

On-device directional hearing requires audio source separation from a given direction while achieving stringent human-imperceptible latency requirements. While neural nets can achieve significantly better performance than traditional beamformers, all existing models fall short of supporting low-latency causal inference on computationally-constrained wearables. We present DeepBeam, a hybrid model that combines traditional beamformers with a custom lightweight neural net. The former reduces the computational burden...

10.1609/aaai.v36i10.21394 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

This paper proposes a speech/music classification system based on i-vectors. An analysis of two methods, namely cosine distance score (CDS) and support vector machine (SVM), is performed. Two session compensation methods, within-class covariance normalization (WCCN) and linear discriminant analysis (LDA), are also discussed. The proposed systems yield better results than the Gaussian mixture model (GMM) method and the modified low energy ratio (MLER) method.

10.1109/isspit.2016.7885999 article EN 2016-12-01
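
The cosine distance score (CDS) mentioned in the abstract above can be illustrated with a minimal sketch. It assumes i-vectors have already been extracted as plain lists of floats and that each class is represented by a single reference i-vector; the function names, the reference-vector setup, and the toy 3-dimensional values are hypothetical and are not taken from the paper.

```python
import math


def cosine_distance_score(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine score is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def classify(test_ivector, class_ivectors):
    """Pick the class whose reference i-vector scores highest against the test i-vector."""
    return max(
        class_ivectors,
        key=lambda label: cosine_distance_score(class_ivectors[label], test_ivector),
    )


# Hypothetical toy i-vectors; real i-vectors typically have hundreds of dimensions.
class_ivectors = {
    "speech": [0.9, 0.1, 0.0],
    "music": [0.1, 0.8, 0.3],
}
label = classify([0.7, 0.2, 0.1], class_ivectors)
```

Because the score depends only on the angle between vectors, it ignores their magnitudes. Session compensation such as WCCN or LDA would be applied as a linear transform to the vectors before scoring; that step is omitted here.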

In this paper, we formulate acoustic howling suppression (AHS) as a supervised learning problem and propose a deep learning approach, called Deep AHS, to address it. Deep AHS is trained in a teacher forcing way, which converts the recurrent process into an instantaneous speech separation process to simplify and accelerate model training. The proposed method utilizes properly designed features and trains an attention based recurrent neural network (RNN) to extract the target signal from the microphone recording, thus attenuating the playback signal that may lead...

10.1109/icassp49357.2023.10095032 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

VSYNC is a novel incremental video file synchronization system that efficiently synchronizes two video files at remote ends through a bi-directional communications link. Retransmission of a video file that has been modified only slightly, for the purpose of synchronization with the remote-end copy, is extremely expensive but avoidable. The VSYNC algorithm is designed to automatically detect and transmit changes in the video file without knowledge of what was changed. Another feature is that it allows synchronization within some user defined distortion constraint. A hierarchical hashing scheme...

10.1145/1459359.1459479 article EN Proceedings of the 30th ACM International Conference on Multimedia 2008-10-26

In this paper, we propose a neural cascade architecture for joint acoustic echo and noise suppression. The proposed architecture consists of two modules. A convolutional recurrent network (CRN) is employed in the first module for complex spectral mapping. Its output is then fed as an additional input to the second module, where a long short-term memory (LSTM) network is utilized for magnitude mask estimation. The entire architecture is trained in an end-to-end manner, with the two modules optimized jointly using a single loss function. The final generated enhanced phase...

10.1109/icassp43922.2022.9747445 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Insecure data storage may open a door for malware to steal users' and the system's sensitive information. These problems may be due to developer negligence or a lack of security knowledge. Android developers use various methods to store data. However, attackers have exploited these vulnerable storage methods. Although developers modified the apps after learning of the vulnerability, users' personal information had already been leaked, with serious consequences. As a result, instead of patching and fixing the vulnerability, we should conduct proactive control for...

10.1109/compsac.2019.00143 article EN 2019-07-01

Reversing the syntactic format of program inputs and data structures in binaries plays a vital role in understanding program behaviors in many security applications. In this paper, we propose a collaborative reversing technique that captures the mapping relationship between input fields and program data structures. The key insight behind our paper is that a program uses the corresponding data structures as references to parse and access different input fields, and every field could be identified by its corresponding data structure. In detail, we use fine-grained dynamic taint analysis to monitor...

10.1109/cc.2014.6969778 article EN China Communications 2014-09-01

A novel tunable comb filter composed of a single-mode/multimode/polarization-maintaining-fiber-based Sagnac fiber loop is proposed and experimentally demonstrated. The tunability is achieved by rotating the polarization controller. The spectral shift is dependent on the rotation direction and the position of the polarization controller. In addition, the adjustable range of the half-wave-plate polarization controller is twice that of the quarter-wave-plate one.

10.1088/1674-1056/22/6/064216 article EN Chinese Physics B 2013-06-01