NFDI4DS | UHH-SEMS - Publication Details

Shixiong Zhang

ORCID: 0000-0002-0314-9199

Publications

About

Contact & Profiles

OpenAlex ID: A5056567731

Research Areas

Speech and Audio Processing
Speech Recognition and Synthesis
Music and Audio Processing
Advanced Adaptive Filtering Techniques
Single-cell and spatial transcriptomics
Gene expression and cancer classification
RNA and protein synthesis mechanisms
RNA Research and Splicing
Hearing Loss and Rehabilitation
Cancer-related molecular mechanisms research
Genomics and Chromatin Dynamics
Indoor and Outdoor Localization Technologies
MicroRNA in disease regulation
Complex Systems and Time Series Analysis
Stock Market Forecasting Methods
Ultrasonics and Acoustic Wave Propagation
Cancer Genomics and Diagnostics
Advanced biosensing and bioanalysis techniques
Video Surveillance and Tracking Methods
Metaheuristic Optimization Algorithms Research
CRISPR and Genetic Engineering
Advanced Vision and Imaging
Gut microbiota and health
Epigenetics and DNA Methylation
Multi-Criteria Decision Making

Hebei University of Chinese Medicine
2025

Nanjing University of Chinese Medicine
2022-2025

Bellevue Hospital Center
2019-2024

Macau University of Science and Technology
2022-2024

Sun Yat-sen University
2022-2024

City University of Hong Kong
2018-2023

Xidian University
2021-2023

University of Electronic Science and Technology of China
2023

Peking University
2021

KLA (United States)
2021

An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation

OPENALEX - Publications

Daniel Michelsanti Zheng‐Hua Tan Shixiong Zhang Yong Xu Yu Meng and 2 more

Speech enhancement and speech separation are two related tasks, whose purpose is to extract either one or more target signals, respectively, from a mixture of sounds generated by several sources. Traditionally, these tasks have been tackled using signal processing machine learning techniques applied the available acoustic signals. Since visual aspect essentially unaffected environment, information speakers, such as lip movements facial expressions, has also used for systems. In order...

10.1109/taslp.2021.3066303 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

Time Domain Audio Visual Speech Separation

OPENALEX - Publications

Jian Wu Yong Xu Shixiong Zhang Lianwu Chen Meng Yu and 2 more

Audio-visual multi-modal modeling has been demonstrated to be effective in many speech related tasks, such as recognition and enhancement. This paper introduces a new time-domain audio-visual architecture for target speaker extraction from monaural mixtures. The generalizes the previous TasNet (time-domain separation network) enable learning at meanwhile it extends classical frequency-domain time-domain. main components of proposed include an audio encoder, video encoder that extracts lip...

10.1109/asru46091.2019.9003983 article EN 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019-12-01

Multi-Modal Multi-Channel Target Speech Separation

OPENALEX - Publications

Rongzhi Gu Shixiong Zhang Yong Xu Lianwu Chen Yuexian Zou and 1 more

Target speech separation refers to extracting a target speaker's voice from an overlapped audio of simultaneous talkers. Previously the use visual modality for has demonstrated great potentials. This work proposes general multi-modal framework by utilizing all available information speaker, including his/her spatial location, characteristics and lip movements. Also, under this framework, we investigate on fusion methods joint modeling. A factorized attention-based method is proposed...

10.1109/jstsp.2020.2980956 article EN IEEE Journal of Selected Topics in Signal Processing 2020-03-01

ADL-MVDR: All Deep Learning MVDR Beamformer for Target Speech Separation

OPENALEX - Publications

Zhuohuang Zhang Yong Xu Meng Yu Shixiong Zhang Lianwu Chen and 1 more

Speech separation algorithms are often used to separate the target speech from other interfering sources. However, purely neural network based systems cause nonlinear distortion that is harmful for automatic recognition (ASR) systems. The conventional mask-based minimum variance distortionless response (MVDR) beamformer can be minimize distortion, but comes with high level of residual noise. Furthermore, matrix operations (e.g., inversion) involved in MVDR solution sometimes numerically...

10.1109/icassp39728.2021.9413594 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

End-to-End Multi-Channel Speech Separation

OPENALEX - Publications

Rongzhi Gu Jian Wu Shixiong Zhang Lianwu Chen Yong Xu and 4 more

The end-to-end approach for single-channel speech separation has been studied recently and shown promising results. This paper extended the previous proposed a new model multi-channel separation. primary contributions of this work include 1) an integrated waveform-in waveform-out system in single neural network architecture. 2) We reformulate traditional short time Fourier transform (STFT) inter-channel phase difference (IPD) as function time-domain convolution with special kernel. 3)...

10.48550/arxiv.1905.06286 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Topological identification and interpretation for single-cell gene regulation elucidation across multiple platforms using scMGCA

OPENALEX - Publications

Zhuohan Yu Yanchi Su Yifu Lu Yuning Yang Fuzhou Wang and 4 more

Abstract Single-cell RNA sequencing provides high-throughput gene expression information to explore cellular heterogeneity at the individual cell level. A major challenge in characterizing data arises from challenges related dimensionality, and prevalence of dropout events. To address these concerns, we develop a deep graph learning method, scMGCA, for single-cell analysis. scMGCA is based on graph-embedding autoencoder that simultaneously learns cell-cell topology representation cluster...

10.1038/s41467-023-36134-7 article EN cc-by Nature Communications 2023-01-25

Bazi Bushen mitigates epigenetic aging and extends healthspan in naturally aging mice

OPENALEX - Publications

Xinjing Mao Yunlong Hou Chao Fang Kun Ma Shixiong Zhang and 13 more

Bazi Bushen (BZBS), a traditional Chinese medicine, has been proven effective in the treatment of age-related disease mouse models. However, whether its therapeutic effects are due to antiaging mechanism not yet explored. In present study, we investigated BZBS naturally aging mice by using behavioral tests, liver DNA methylome sequencing, methylation age estimation, and frailty index assessment. The analysis revealed decrease mCpG levels aged liver. tended restore age-associated decline...

10.1016/j.biopha.2023.114384 article EN cc-by-nc-nd Biomedicine & Pharmacotherapy 2023-02-09

Bazi Bushen capsule improves the deterioration of the intestinal barrier function by inhibiting NLRP3 inflammasome-mediated pyroptosis through microbiota-gut-brain axis

OPENALEX - Publications

Shixiong Zhang Mengnan Li Liping Chang Xinjing Mao Yuning Jiang and 10 more

Purpose The senescence-accelerated prone mouse 8 (SAMP8) is a widely used model for accelerating aging, especially in central aging. Mounting evidence indicates that the microbiota-gut-brain axis may be involved pathogenesis and progression of aging-related diseases. This study aims to investigate whether Bazi Bushen capsule (BZBS) attenuates deterioration intestinal function aging animal model. Methods In our study, SAMP8 mice were randomly divided into group, BZ-low group (0.5 g/kg/d...

10.3389/fmicb.2023.1320202 article EN cc-by Frontiers in Microbiology 2024-01-08

CRISPR‐Net: A Recurrent Convolutional Network Quantifies CRISPR Off‐Target Activities with Mismatches and Indels

OPENALEX - Publications

Jiecong Lin Zhaolei Zhang Shixiong Zhang Junyi Chen Ka‐Chun Wong

Abstract The off‐target effects induced by guide RNAs in the CRISPR/Cas9 gene‐editing system have raised substantial concerns recent years. Many silico predictive models been developed for predicting activities; however, few are capable of activities with insertions or deletions between RNA and target DNA sequence pair. In order to fill this gap, a recurrent convolutional network named CRISPR‐Net is scoring gRNA‐target pairs mismatches indels; machine‐learning based model...

10.1002/advs.201903562 article EN cc-by Advanced Science 2020-05-20

MiR-590-3p Attenuates Acute Kidney Injury by Inhibiting Tumor Necrosis Factor Receptor-Associated Factor 6 in Septic Mice

OPENALEX - Publications

Jing Ma Yutao Li Shixiong Zhang Shouzhi Fu Xianzhi Ye

10.1007/s10753-018-0921-5 article EN Inflammation 2018-11-03

Complex Neural Spatial Filter: Enhancing Multi-Channel Target Speech Separation in Complex Domain

OPENALEX - Publications

Rongzhi Gu Shixiong Zhang Yuexian Zou Dong Yu

To date, mainstream target speech separation (TSS) approaches are formulated to estimate the complex ratio mask (cRM) of in time-frequency domain under supervised deep learning framework. However, existing models for estimating cRM designed way that real and imaginary parts separately modeled using real-valued training data pairs. The research motivation this study is design a model fully exploits temporal-spectral-spatial information multi-channel signals directly efficiently domain. As...

10.1109/lsp.2021.3076374 article EN IEEE Signal Processing Letters 2021-01-01

Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation

OPENALEX - Publications

Yong Xu Zhuohuang Zhang Meng Yu Shixiong Zhang Dong Yu

Although the conventional mask-based minimum variance distortionless response (MVDR) could reduce non-linear distortion, residual noise level of MVDR separated speech is still high. In this paper, we propose a spatio-temporal recurrent neural network based beamformer (RNN-BF) for target separation. This new beamforming framework directly learns weights from estimated and spatial covariance matrices. Leveraging on temporal modeling capability RNNs, RNN-BF automatically accumulate statistics...

10.21437/interspeech.2021-430 article EN Interspeech 2021 2021-08-27

Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

OPENALEX - Publications

Rongzhi Gu Shixiong Zhang Yuexian Zou Dong Yu

Recently, frequency domain all-neural beamforming methods have achieved remarkable progress for multichannel speech separation. In parallel, the integration of time network structure and also gains significant attention. This study proposes a novel method in makes an attempt to unify pipelines The proposed model consists two modules: separation beamforming. Both modules perform temporal-spectral-spatial modeling are trained from end-to-end using joint loss function. novelty this lies folds....

10.1109/taslp.2022.3229261 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2022-12-14

Fast-Rir: Fast Neural Diffuse Room Impulse Response Generator

OPENALEX - Publications

Anton Ratnarajah Shixiong Zhang Meng Yu Zhenyu Tang Dinesh Manocha and 1 more

We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating responses (RIRs) given acoustic environment. Our FAST-RIR takes rectangular dimensions, listener and speaker positions, reverberation time (T60) as inputs generates specular reflections is capable of RIRs input T with an average error 0.02s. evaluate our generated in automatic speech...

10.1109/icassp43922.2022.9747846 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Synergizing CRISPR/Cas9 off-target predictions for ensemble insights and practical applications

OPENALEX - Publications

Shixiong Zhang Xiangtao Li Qiuzhen Lin Ka‐Chun Wong

Abstract Motivation The RNA-guided CRISPR/Cas9 system has been widely applied to genome editing. can effectively edit the on-target genes. Nonetheless, it recently demonstrated that many homologous off-target genomic sequences could be mutated, leading unexpected gene-editing outcomes. Therefore, a plethora of tools were proposed for prediction activities CRISPR/Cas9. each computational tool its own advantages and drawbacks under diverse conditions. It is hardly believed single optimal all...

10.1093/bioinformatics/bty748 article EN Bioinformatics 2018-08-30

Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation

OPENALEX - Publications

Zhuohuang Zhang Yong Xu Meng Yu Shixiong Zhang Lianwu Chen and 2 more

Many purely neural network based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are harmful modern automatic recognition (ASR) systems. Minimum variance distortionless response (MVDR) filters adopted remove distortions, however, conventional mask-based MVDR systems still result in relatively high levels of residual noise. Moreover, the matrix inverse involved solution is sometimes numerically...

10.1109/taslp.2021.3129335 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01

Bazi Bushen capsule attenuates cognitive deficits by inhibiting microglia activation and cellular senescence

OPENALEX - Publications

Chuanyuan Ji Cong Wei Mengnan Li Shuang Shen Shixiong Zhang and 2 more

Context Bazi Bushen capsule (BZBS) has anti-ageing properties and is effective in enhancing memory. Objective To find evidence supporting the mechanisms biomarkers by which BZBS functions. Materials methods Male C57BL/6J mice were randomly divided into five groups: normal, ageing, β-nicotinamide mononucleotide (NMN), low-dose (LD-BZ) high-dose (HD-BZ). The last four groups subcutaneously injected with d-galactose (d-gal, 100 mg/kg/d) to induce ageing process. At same time, LD-BZ, HD-BZ NMN...

10.1080/13880209.2022.2131839 article EN cc-by-nc Pharmaceutical Biology 2022-10-20

MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning

OPENALEX - Publications

Ruize Xu Ruoxuan Feng Shixiong Zhang Di Hu

Audio-visual learning helps to comprehensively understand the world by fusing practical information from multiple modalities. However, recent studies show that imbalanced optimization of uni-modal encoders in a joint-learning model is bottleneck enhancing model's performance. We further find up-to-date imbalance-mitigating methods fail on some audio-visual fine-grained tasks, which have higher demand for distinguishable feature distribution. Fueled success cosine loss builds hyperspherical...

10.1109/icassp49357.2023.10096655 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Bazi Bushen Capsule Modulates Akkermansia muciniphila and Spermidine Metabolism to Attenuate Brain Aging in SAMP8 Mice

OPENALEX - Publications

Shixiong Zhang Xinjing Mao Liping Chang Mengnan Li Cong Wei and 14 more

10.1016/j.jep.2025.119944 article EN Journal of Ethnopharmacology 2025-05-01

Encrypted Speech Recognition Using Deep Polynomial Networks

OPENALEX - Publications

Shixiong Zhang Yifan Gong Dong Yu

The cloud-based speech recognition/API provides developers or enterprises an easy way to create speech-enabled features in their applications. However, sending audios about personal company internal information the cloud, raises concerns privacy and security issues. recognition results generated cloud may also reveal some sensitive information. This paper proposes a deep polynomial network (DPN) that can be applied encrypted as acoustic model. It allows clients send data form ensure remains...

10.1109/icassp.2019.8683721 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

A novel method based on FTS with both GA-FCM and multifactor BPNN for stock forecasting

OPENALEX - Publications

Wenyu Zhang Shixiong Zhang Shuai Zhang Dejian Yu Ningning Huang

10.1007/s00500-018-3335-2 article EN Soft Computing 2018-06-22

Nature-Inspired Multiobjective Epistasis Elucidation from Genome-Wide Association Studies

OPENALEX - Publications

Xiangtao Li Shixiong Zhang Ka‐Chun Wong

In recent years, the detection of epistatic interactions multiple genetic variants on causes complex diseases brings a significant challenge in genome-wide association studies (GWAS). However, most existing methods still suffer from algorithmic limitations such as single-objective optimization, intensive computational requirement, and premature convergence. this paper, we propose formulate an interaction multi-objective artificial bee colony algorithm based decomposition (EIMOABC/D) to...

10.1109/tcbb.2018.2849759 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018-06-22

Single-cell RNA-seq interpretations using evolutionary multiobjective ensemble pruning

OPENALEX - Publications

Xiangtao Li Shixiong Zhang Ka‐Chun Wong

Abstract Motivation In recent years, single-cell RNA sequencing enables us to discover cell types or even subtypes. Its increasing availability provides opportunities identify populations from RNA-seq data. Computational methods have been employed reveal the gene expression variations among multiple populations. Unfortunately, existing ones can suffer realistic restrictions such as experimental noises, numerical instability, high dimensionality and computational scalability. Results We...

10.1093/bioinformatics/bty1056 article EN Bioinformatics 2018-12-21

A multi-factor and high-order stock forecast model based on Type-2 FTS using cuckoo search and self-adaptive harmony search

OPENALEX - Publications

Wenyu Zhang Shixiong Zhang Shuai Zhang Dejian Yu Ningning Huang

10.1016/j.neucom.2017.02.054 article EN Neurocomputing 2017-02-22
