- Speech and Audio Processing
- Advanced Adaptive Filtering Techniques
- Blind Source Separation Techniques
- Speech Recognition and Synthesis
- Music and Audio Processing
- Hearing Loss and Rehabilitation
- Advanced Data Compression Techniques
- Acoustic Wave Phenomena Research
- Phonetics and Phonology Research
- Direction-of-Arrival Estimation Techniques
- Image and Signal Denoising Methods
- PAPR reduction in OFDM
- Advanced Power Amplifier Design
- Indoor and Outdoor Localization Technologies
- Underwater Acoustics Research
- Infant Health and Development
- Music Technology and Sound Studies
- Numerical methods in engineering
- Hand Gesture Recognition Systems
- Geophysics and Gravity Measurements
- Digital Imaging for Blood Diseases
- Iterative Methods for Nonlinear Equations
- Hearing Impairment and Communication
- Ultrasonics and Acoustic Wave Propagation
- Vehicle Noise and Vibration Control
GN Store Nord (Denmark)
2022-2024
Datta Meghe Institute of Medical Sciences
2024
Lyngsø Marine (Denmark)
2020
Graz University of Technology
2013-2019
Institut für Informationsverarbeitung
2013-2019
Widex (Denmark)
2018-2019
Signal Processing (United States)
2015-2016
Ruhr University Bochum
2011-2013
Amirkabir University of Technology
2007-2011
Aalborg University
2010
Many short-time Fourier transform (STFT) based single-channel speech enhancement algorithms are focused on estimating the clean spectral amplitude from the noisy observed signal in order to suppress the additive noise. To this end, they utilize the amplitude information and the corresponding a priori and a posteriori SNRs, while they employ the noisy phase when reconstructing the enhanced signal. This paper presents two contributions: i) reconsidering the relation between the group delay deviation and the phase deviation, and ii) proposing a closed-loop approach...
Single-channel speech separation algorithms frequently ignore the issue of accurate phase estimation while reconstructing the enhanced signal. Instead, they directly employ the mixed-signal phase for signal reconstruction, which leads to undesired traces of the interfering source in the target signal. In this paper, assuming a given knowledge of the spectrum amplitude, we present a solution to estimate the phase information of the sources from a single-channel mixture observation. We first investigate the effectiveness of the proposed method employing known...
In conventional single-channel speech enhancement, typically the noisy spectral amplitude is modified while the noisy phase is used to reconstruct the enhanced signal. Several recent attempts have shown the effectiveness of utilizing an improved phase for phase-aware enhancement and consequently its positive impact on the perceived speech quality. In this paper, we present a harmonic phase estimation method relying on the fundamental frequency and signal-to-noise ratio (SNR) information estimated from the noisy speech. The proposed method relies on SNR-based...
Sum-product networks (SPNs) are a recently proposed type of probabilistic graphical models allowing complex variable interactions while still granting efficient inference. In this paper we demonstrate the suitability of SPNs for modeling log-spectra of speech signals using the application of artificial bandwidth extension, i.e. artificially replacing the high-frequency content which is lost in telephone signals. We use SPNs as observation models in hidden Markov models (HMMs), which model the temporal evolution of log short-time spectra....
In this paper, we present an overview on the previous and recent methods proposed to estimate a clean spectral phase from a noisy observation in the context of single-channel speech enhancement. The importance of phase estimation in speech enhancement is inspired by recent reports on its usefulness in finding a phase-sensitive amplitude estimation. We present a comparative study of the methods and elaborate on their limits. We propose a new phase enhancement method relying on phase decomposition and time-frequency smoothing filters. We demonstrate that the proposed method successfully reduces the phase variance at harmonics. Our...
Conventional speech enhancement methods typically utilize the noisy phase spectrum for signal reconstruction. This letter presents a novel method to estimate the clean phase spectrum, given the noisy observation in single-channel speech enhancement. The proposed method relies on the phase decomposition of the instantaneous noisy phase spectrum followed by temporal smoothing in order to reduce the large variance of the noisy phase, and consequently reconstructs an enhanced phase spectrum for signal reconstruction. The effectiveness of the proposed method is evaluated in two ways: phase enhancement-only and quantifying the additional improvement on top of the conventional...
Finding and monitoring many diseases, including infections and leukaemia, depend on the classification of white blood cells (WBCs). Conventional techniques mostly rely on seeing samples under a microscope and calling for an expert to evaluate the outcomes. This method takes a lot of time, and human mistake might create errors. This work uses a lightweight convolutional neural network (CNN) to provide a basic model to enhance the automated categorisation of white blood cells (WBCs) via deep learning. The proposed paradigm maximises computational...
In this paper, we present a novel system for joint speaker identification and speech separation. For single-channel speaker identification, an algorithm is proposed which provides an estimate of the signal-to-signal ratio (SSR) as a by-product. For speech separation, we propose a sinusoidal model-based algorithm. The speech separation algorithm consists of a double-talk/single-talk detector followed by a minimum mean square error estimator of sinusoidal parameters for finding optimal codevectors from pre-trained codebooks. For evaluating the proposed system, we start from a situation where we have prior...
Previous studies on performance evaluation of single-channel speech separation (SCSS) algorithms mostly focused on automatic speech recognition (ASR) accuracy as their performance measure. Assessing the separated signals by different metrics other than this has the benefit that the results are expected to carry on to other applications beyond ASR. In this paper, in addition to conventional speech quality metrics (PESQ and SNR_loss), we also evaluate...
Time-frequency masking is a common solution for the single-channel source separation (SCSS) problem where the goal is to find a time-frequency mask that separates the underlying sources from an observed mixture. An estimated mask is then applied to the mixed signal to extract the desired signal. During reconstruction, the time-frequency–masked spectral amplitude is combined with the mixture phase. This article considers the impact of replacing the mixture phase with an estimated clean spectral phase combined with the estimated magnitude spectrum using a conventional model-based approach. As the proposed phase estimator...
We present new results on single-channel speech separation and suggest a new approach to improve the speech quality of separated signals from an observed mixture. The key idea is to derive a mixture estimator based on sinusoidal parameters. The proposed estimator is aimed at finding sinusoidal parameters in the form of codevectors from vector quantization (VQ) codebooks pre-trained for speakers that, when combined, best fit the observed mixed signal. The selected codevectors are then used to reconstruct the recovered signal. Compared to log-max binary masks and the Wiener filtering approach, it is observed that...
In this paper, we study the impact of exploiting the spectral phase information to further improve the speech quality of single-channel speech enhancement algorithms. In particular, we focus on two required steps in a typical single-channel speech enhancement system, namely: parameter estimation solved by a minimum mean square error (MMSE) estimator of the spectral amplitude, followed by a signal reconstruction stage, where the observed noisy phase is often used. For the parameter estimation stage, in contrast to the conventional Wiener filter, a new MMSE estimator is derived which takes into account the clean phase information as prior information. In our...
To approximate the speech quality of a given speech enhancement system, most existing instrumental metrics rely on the calculation of a distortion metric defined between the clean reference signal and the enhanced signal in the spectral amplitude domain. Several recent studies have demonstrated the effectiveness of employing a phase modification stage in single-channel speech enhancement, showing the positive impact brought by modifying both amplitude and phase, in contrast to conventional methods where the noisy amplitude is only modified and the noisy phase is used for signal reconstruction. In this work we present two...
In single-channel speech enhancement the spectral amplitude of the noisy signal is often modified while the noisy phase is directly employed for signal reconstruction. Recently, additional improvement in speech enhancement performance has been reported when the noisy phase is modified. In this work, we propose a Bayesian estimator for the spectral phase of harmonics given the noisy speech. The proposed estimator relies on the fundamental frequency and the signal-to-noise ratio at harmonics. Throughout our experiments, we evaluate the proposed phase estimator in comparison with the noisy phase, and as a benchmark the clean phase as upper-bound. The proposed method leads to joint...
In this paper, we consider speaker identification for the co-channel scenario in which the speech mixture from speakers is recorded by one microphone only. The goal is to identify both of the speakers from their mixed signal. High recognition accuracies have already been reported when an accurately estimated signal-to-signal ratio (SSR) is available. In this paper, we approach the problem without estimating the SSR. We show that a simple method based on the fusion of adapted Gaussian mixture models and the Kullback-Leibler divergence calculated between models,...
Previous single-channel speech enhancement algorithms often employ the noisy phase while reconstructing the enhanced signal. In this paper, we propose novel phase estimation methods by employing several temporal and spectral constraints imposed on the phase spectrum of the speech signal. We pose the phase estimation problem as estimating the unknown clean speech phase at sinusoids observed in additive noise. To resolve the ambiguity in the phase estimation problem, we introduce individual time-frequency constraints: group delay deviation, instantaneous frequency deviation, and relative phase shift. Through extensive...
In many artificial intelligence systems, human voice is considered as the medium for information transmission. Human-machine communication by voice becomes difficult when speech is mixed with some background noise. As a remedy, single-channel speech enhancement is indispensable for reducing noise from noisy speech to make it suitable for automatic speech recognition and telephony speech. While conventional speech enhancement techniques incorporate the noisy phase in both amplitude estimation and signal reconstruction stages, in this paper we propose a probabilistic...
In this paper, we propose a closed loop system to improve the performance of single-channel speech separation in a speaker independent scenario. The system is composed of two interconnected blocks: a separation block and a speaker identification block. The improvement is accomplished by incorporating the speaker identities found by the speaker identification block as additional information for the separation block, which converts the speaker-independent separation problem to a speaker-dependent one where the speaker codebooks are known. Simulation results show that the closed loop system enhances the quality of the separated output signals. To assess...
Partial phase reconstruction based on a confidence domain has recently been shown to provide improved signal reconstruction performance in a single-channel source separation scenario. In this paper, we replace the previous binarized fixed-threshold confidence domain with a new signal-dependent one estimated by employing a sinusoidal model to be applied on the magnitude spectrum of the underlying sources in the mixture. We also extend the sinusoidal-based confidence domain into the Multiple Input Spectrogram Inversion (MISI) framework, and propose to re-distribute the remixing error...
In this work, we present a universal codebook-based speech enhancement framework that relies on a single codebook to encode both speech and noise components. The atomic speech presence probability (ASPP) is defined as the probability that a given atom encodes speech at a given point in time. We develop ASPP estimators based on binaural cues including the interaural phase and level difference (IPD and ILD), the interaural coherence magnitude (ICM), as well as a combined version leveraging the full interaural transfer function (ITF). We evaluate the performance of the resulting ASPP-based speech enhancement algorithms...