Shixiong Zhang

ORCID: 0000-0002-0314-9199
Research Areas
  • Speech and Audio Processing
  • Speech Recognition and Synthesis
  • Music and Audio Processing
  • Advanced Adaptive Filtering Techniques
  • Single-cell and spatial transcriptomics
  • Gene expression and cancer classification
  • RNA and protein synthesis mechanisms
  • RNA Research and Splicing
  • Hearing Loss and Rehabilitation
  • Cancer-related molecular mechanisms research
  • Genomics and Chromatin Dynamics
  • Indoor and Outdoor Localization Technologies
  • MicroRNA in disease regulation
  • Complex Systems and Time Series Analysis
  • Stock Market Forecasting Methods
  • Ultrasonics and Acoustic Wave Propagation
  • Cancer Genomics and Diagnostics
  • Advanced biosensing and bioanalysis techniques
  • Video Surveillance and Tracking Methods
  • Metaheuristic Optimization Algorithms Research
  • CRISPR and Genetic Engineering
  • Advanced Vision and Imaging
  • Gut microbiota and health
  • Epigenetics and DNA Methylation
  • Multi-Criteria Decision Making

Hebei University of Chinese Medicine
2025

Nanjing University of Chinese Medicine
2022-2025

Bellevue Hospital Center
2019-2024

Macau University of Science and Technology
2022-2024

Sun Yat-sen University
2022-2024

City University of Hong Kong
2018-2023

Xidian University
2021-2023

University of Electronic Science and Technology of China
2023

Peking University
2021

KLA (United States)
2021

Speech enhancement and speech separation are two related tasks, whose purpose is to extract either one or more target signals, respectively, from a mixture of sounds generated by several sources. Traditionally, these tasks have been tackled using signal processing and machine learning techniques applied to the available acoustic signals. Since the visual aspect of speech is essentially unaffected by the acoustic environment, visual information from the target speakers, such as lip movements and facial expressions, has also been used for speech enhancement and speech separation systems. In order...

10.1109/taslp.2021.3066303 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

Audio-visual multi-modal modeling has been demonstrated to be effective in many speech related tasks, such as speech recognition and speech enhancement. This paper introduces a new time-domain audio-visual architecture for target speaker extraction from monaural mixtures. The architecture generalizes the previous TasNet (time-domain separation network) to enable multi-modal learning; meanwhile, it extends classical frequency-domain audio-visual speech separation to the time-domain. The main components of the proposed architecture include an audio encoder, a video encoder that extracts lip...

10.1109/asru46091.2019.9003983 article EN 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019-12-01

Target speech separation refers to extracting a target speaker's voice from an overlapped audio of simultaneous talkers. Previously, the use of the visual modality for target speech separation has demonstrated great potential. This work proposes a general multi-modal framework for target speech separation by utilizing all the available information of the target speaker, including his/her spatial location, voice characteristics and lip movements. Also, under this framework, we investigate the fusion methods for multi-modal joint modeling. A factorized attention-based fusion method is proposed...

10.1109/jstsp.2020.2980956 article EN IEEE Journal of Selected Topics in Signal Processing 2020-03-01

Speech separation algorithms are often used to separate the target speech from other interfering sources. However, purely neural network based speech separation systems cause nonlinear distortion that is harmful for automatic speech recognition (ASR) systems. The conventional mask-based minimum variance distortionless response (MVDR) beamformer can be used to minimize the distortion, but comes with a high level of residual noise. Furthermore, the matrix operations (e.g., matrix inversion) involved in the MVDR solution are sometimes numerically...

10.1109/icassp39728.2021.9413594 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13
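As background for the MVDR entries in this list, the following is a minimal sketch of the textbook MVDR weight computation for a single frequency bin, w = R^-1 d / (d^H R^-1 d). It is not the method proposed in the paper above. The function name `mvdr_weights`, the diagonal-loading term, the toy covariance, and the steering vector are all assumptions chosen for illustration; the loading term is one common way to soften the numerical instability of the matrix inversion that the abstract mentions.

```python
import numpy as np

def mvdr_weights(noise_cov, steering, loading=1e-6):
    """Textbook MVDR weights for one frequency bin: w = R^-1 d / (d^H R^-1 d)."""
    n_ch = noise_cov.shape[0]
    # Diagonal loading (an illustrative assumption) keeps the solve well conditioned.
    scale = np.trace(noise_cov).real / n_ch
    reg = noise_cov + loading * scale * np.eye(n_ch)
    # Solve R x = d instead of forming an explicit inverse.
    num = np.linalg.solve(reg, steering)
    # np.vdot conjugates its first argument, so this is d^H R^-1 d.
    return num / np.vdot(steering, num)

# Toy data: a random Hermitian noise covariance and a hypothetical steering vector.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 16)) + 1j * rng.standard_normal((4, 16))
noise_cov = a @ a.conj().T / 16
steering = np.exp(-1j * np.pi * 0.3 * np.arange(4))

w = mvdr_weights(noise_cov, steering)
response = np.vdot(w, steering)  # w^H d, the gain toward the target direction
```

The distortionless constraint means `response` should stay at unit gain toward the steering direction, whatever the noise covariance is.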

The end-to-end approach for single-channel speech separation has been studied recently and shown promising results. This paper extended the previous approach and proposed a new end-to-end model for multi-channel speech separation. The primary contributions of this work include 1) an integrated waveform-in waveform-out separation system in a single neural network architecture. 2) We reformulate the traditional short time Fourier transform (STFT) and inter-channel phase difference (IPD) as a function of time-domain convolution with a special kernel. 3)...

10.48550/arxiv.1905.06286 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Single-cell RNA sequencing provides high-throughput gene expression information to explore cellular heterogeneity at the individual cell level. A major challenge in characterizing the data arises from challenges related to dimensionality and the prevalence of dropout events. To address these concerns, we develop a deep graph learning method, scMGCA, for single-cell data analysis. scMGCA is based on a graph-embedding autoencoder that simultaneously learns cell-cell topology representation and cluster...

10.1038/s41467-023-36134-7 article EN cc-by Nature Communications 2023-01-25

Bazi Bushen (BZBS), a traditional Chinese medicine, has been proven effective in the treatment of age-related disease in mouse models. However, whether its therapeutic effects are due to an antiaging mechanism has not yet been explored. In the present study, we investigated the antiaging effects of BZBS in naturally aging mice by using behavioral tests, liver DNA methylome sequencing, methylation age estimation, and frailty index assessment. The methylome analysis revealed a decrease in mCpG levels in the aged liver. BZBS tended to restore age-associated decline...

10.1016/j.biopha.2023.114384 article EN cc-by-nc-nd Biomedicine & Pharmacotherapy 2023-02-09

Purpose: The senescence-accelerated prone mouse 8 (SAMP8) is a widely used model for accelerating aging, especially in central aging. Mounting evidence indicates that the microbiota-gut-brain axis may be involved in the pathogenesis and progression of aging-related diseases. This study aims to investigate whether Bazi Bushen capsule (BZBS) attenuates the deterioration of intestinal function in the aging animal model. Methods: In our study, SAMP8 mice were randomly divided into the model group, the BZ-low group (0.5 g/kg/d...

10.3389/fmicb.2023.1320202 article EN cc-by Frontiers in Microbiology 2024-01-08

The off‐target effects induced by guide RNAs in the CRISPR/Cas9 gene‐editing system have raised substantial concerns in recent years. Many in silico predictive models have been developed for predicting the off‐target activities; however, few are capable of predicting the off‐target activities with insertions or deletions between the guide RNA and target DNA sequence pair. In order to fill this gap, a recurrent convolutional network named CRISPR‐Net is developed for scoring gRNA‐target pairs with mismatches and indels; machine‐learning based model...

10.1002/advs.201903562 article EN cc-by Advanced Science 2020-05-20

To date, mainstream target speech separation (TSS) approaches are formulated to estimate the complex ratio mask (cRM) of the target speech in the time-frequency domain under the supervised deep learning framework. However, the existing deep models for estimating the cRM are designed in the way that the real and imaginary parts of the cRM are separately modeled using real-valued training data pairs. The research motivation of this study is to design a deep model that fully exploits the temporal-spectral-spatial information of multi-channel signals for estimating the cRM directly and efficiently in the complex domain. As...

10.1109/lsp.2021.3076374 article EN IEEE Signal Processing Letters 2021-01-01
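To make the complex ratio mask in the entry above concrete, here is a minimal sketch of the ideal (oracle) cRM, computed per time-frequency bin as M = S / Y from a known target S and mixture Y. This is the standard definition only, not the deep model the paper proposes to estimate the mask; the function names, the `eps` guard, and the three toy bins are assumptions for illustration.

```python
def complex_ratio_mask(target, mixture, eps=1e-8):
    """Ideal cRM per time-frequency bin: M = S / Y, written as S * conj(Y) / |Y|^2."""
    return [s * y.conjugate() / (abs(y) ** 2 + eps)
            for s, y in zip(target, mixture)]

def apply_mask(mask, mixture):
    """Complex multiplication rescales the magnitude and rotates the phase of each bin."""
    return [m * y for m, y in zip(mask, mixture)]

# Toy complex spectra for three time-frequency bins (hypothetical values).
target = [1 + 2j, 0.5 - 1j, -0.3 + 0.2j]
interference = [0.2 - 0.1j, -1 + 0.5j, 0.4 + 0.4j]
mixture = [s + n for s, n in zip(target, interference)]

mask = complex_ratio_mask(target, mixture)
estimate = apply_mask(mask, mixture)

# The real and imaginary parts are the two quantities the abstract says
# existing models predict separately.
mask_real = [m.real for m in mask]
mask_imag = [m.imag for m in mask]
```

Because the mask is complex, applying it corrects both magnitude and phase, so with the oracle mask `estimate` matches `target` up to the small `eps` guard.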

Although the conventional mask-based minimum variance distortionless response (MVDR) could reduce the non-linear distortion, the residual noise level of the MVDR separated speech is still high. In this paper, we propose a spatio-temporal recurrent neural network based beamformer (RNN-BF) for target speech separation. This new beamforming framework directly learns the beamforming weights from the estimated speech and noise spatial covariance matrices. Leveraging on the temporal modeling capability of RNNs, the RNN-BF could automatically accumulate the statistics...

10.21437/interspeech.2021-430 article EN Interspeech 2021 2021-08-27

Recently, frequency domain all-neural beamforming methods have achieved remarkable progress for multichannel speech separation. In parallel, the integration of time domain network structure and beamforming also gains significant attention. This study proposes a novel all-neural beamforming method in the time domain and makes an attempt to unify the all-neural beamforming pipelines for the time domain and the frequency domain. The proposed model consists of two modules: separation and beamforming. Both modules perform temporal-spectral-spatial modeling and are trained end-to-end using a joint loss function. The novelty of this study lies in two folds...

10.1109/taslp.2022.3229261 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2022-12-14

We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment. Our FAST-RIR takes rectangular room dimensions, listener and speaker positions, and reverberation time (T60) as inputs and generates specular and diffuse reflections for a given acoustic environment. Our FAST-RIR is capable of generating RIRs for a given input T60 with an average error of 0.02s. We evaluate our generated RIRs in automatic speech...

10.1109/icassp43922.2022.9747846 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Motivation: The RNA-guided CRISPR/Cas9 system has been widely applied to genome editing. The system can effectively edit the on-target genes. Nonetheless, it has recently been demonstrated that many homologous off-target genomic sequences could be mutated, leading to unexpected gene-editing outcomes. Therefore, a plethora of tools were proposed for the prediction of off-target activities of CRISPR/Cas9. Each computational tool has its own advantages and drawbacks under diverse conditions. It is hardly believed that a single tool is optimal for all...

10.1093/bioinformatics/bty748 article EN Bioinformatics 2018-08-30

Many purely neural network based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are harmful to modern automatic speech recognition (ASR) systems. Minimum variance distortionless response (MVDR) filters are often adopted to remove nonlinear distortions; however, conventional mask-based MVDR systems still result in relatively high levels of residual noise. Moreover, the matrix inverse involved in the MVDR solution is sometimes numerically...

10.1109/taslp.2021.3129335 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

Context: Bazi Bushen capsule (BZBS) has anti-ageing properties and is effective in enhancing memory. Objective: To find evidence supporting the mechanisms and biomarkers by which BZBS functions. Materials and methods: Male C57BL/6J mice were randomly divided into five groups: normal, ageing, β-nicotinamide mononucleotide (NMN), low-dose BZBS (LD-BZ) and high-dose BZBS (HD-BZ). The last four groups were subcutaneously injected with d-galactose (d-gal, 100 mg/kg/d) to induce the ageing process. At the same time, the LD-BZ, HD-BZ and NMN...

10.1080/13880209.2022.2131839 article EN cc-by-nc Pharmaceutical Biology 2022-10-20

Audio-visual learning helps to comprehensively understand the world by fusing practical information from multiple modalities. However, recent studies show that the imbalanced optimization of uni-modal encoders in a joint-learning model is a bottleneck to enhancing the model's performance. We further find that the up-to-date imbalance-mitigating methods fail on some audio-visual fine-grained tasks, which have a higher demand for distinguishable feature distribution. Fueled by the success of cosine loss that builds hyperspherical...

10.1109/icassp49357.2023.10096655 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

The cloud-based speech recognition/API provides developers or enterprises an easy way to create speech-enabled features in their applications. However, sending audios about personal or company internal information to the cloud raises concerns about privacy and security issues. The recognition results generated in the cloud may also reveal some sensitive information. This paper proposes a deep polynomial network (DPN) that can be applied to the encrypted speech as an acoustic model. It allows clients to send their data in an encrypted form to ensure that their data remains...

10.1109/icassp.2019.8683721 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

In recent years, the detection of epistatic interactions of multiple genetic variants on the causes of complex diseases brings a significant challenge in genome-wide association studies (GWAS). However, most of the existing methods still suffer from algorithmic limitations such as single-objective optimization, intensive computational requirement, and premature convergence. In this paper, we propose and formulate an epistatic interaction multi-objective artificial bee colony algorithm based on decomposition (EIMOABC/D) to...

10.1109/tcbb.2018.2849759 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018-06-22

Motivation: In recent years, single-cell RNA sequencing enables us to discover cell types or even subtypes. Its increasing availability provides opportunities to identify cell populations from single-cell RNA-seq data. Computational methods have been employed to reveal the gene expression variations among multiple cell populations. Unfortunately, the existing ones can suffer from realistic restrictions such as experimental noises, numerical instability, high dimensionality and computational scalability. Results: We...

10.1093/bioinformatics/bty1056 article EN Bioinformatics 2018-12-21