NFDI4DS | UHH-SEMS - Publication Details

The importance of phase in speech enhancement

OPENALEX - Publications

Kuldip K. Paliwal Kamil Wójcicki Benjamin J. Shannon

10.1016/j.specom.2010.12.003 article EN Speech Communication 2010-12-25

Single-channel speech enhancement using spectral subtraction in the short-time modulation domain

Kuldip K. Paliwal Kamil Wójcicki Belinda Schwerin

10.1016/j.specom.2010.02.004 article EN Speech Communication 2010-02-20

Speech enhancement using a minimum mean-square error short-time spectral modulation magnitude estimator

Kuldip K. Paliwal Belinda Schwerin Kamil Wójcicki

10.1016/j.specom.2011.09.003 article EN Speech Communication 2011-09-25

Exploiting Conjugate Symmetry of the Short-Time Fourier Spectrum for Speech Enhancement

Kamil Wójcicki Mitar Milacic Anthony Stark James Lyons Kuldip K. Paliwal

Typical speech enhancement algorithms operate on the short-time magnitude spectrum, while keeping the short-time phase spectrum unchanged for synthesis. We propose a novel approach where the noisy magnitude spectrum is recombined with a changed phase spectrum to produce a modified complex spectrum. During synthesis, the low energy components of the modified complex spectrum cancel out more than the high energy components, thus reducing background noise. Using objective quality measures, informal subjective listening tests and spectrogram analysis, we show that the proposed method results in...

10.1109/lsp.2008.923579 article EN IEEE Signal Processing Letters 2008-01-01

Preference for 20-40 ms window duration in speech analysis

Kuldip K. Paliwal James Lyons Kamil Wójcicki

In speech processing the short-time magnitude spectrum is believed to contain most of the information about speech intelligibility and it is normally computed using the short-time Fourier transform over 20-40 ms window duration. In this paper, we investigate the effect of the analysis window duration on speech intelligibility in a systematic way. For this purpose, both subjective and objective experiments are conducted. The subjective experiment is in the form of a consonant recognition task by human listeners, whereas the objective experiment is in the form of an automatic speech recognition (ASR) task. In our experiments various analysis window durations are investigated. We construct stimuli...

10.1109/icspcs.2010.5709770 article EN 2010-12-01

Noise driven short-time phase spectrum compensation procedure for speech enhancement

Anthony Stark Kamil Wójcicki James Lyons Kuldip K. Paliwal

Typical speech enhancement algorithms operate on the short-time magnitude spectrum, while keeping the short-time phase spectrum unchanged for synthesis. Recently, a novel approach to speech enhancement has been proposed where the noisy magnitude spectrum is recombined with a changed phase spectrum to produce a modified complex spectrum. During synthesis, the low energy components of the modified complex spectrum cancel out more than the high energy components, thus reducing background noise. In the present work, a procedure that employs noise estimates to compensate for additive noise distortion is formulated. The objectively...

10.21437/interspeech.2008-163 article EN Interspeech 2008 2008-09-22

Effect of Analysis Window Duration on Speech Intelligibility

Kuldip K. Paliwal Kamil Wójcicki

In this letter, we investigate the effect of analysis window duration on speech intelligibility in a systematic way. In speech processing, the short-time magnitude spectrum is believed to contain the majority of the intelligible information. Consequently, in our experiments, we construct speech stimuli based purely on the short-time magnitude spectrum. We conduct subjective listening tests in the form of a consonant recognition task to assess intelligibility as a function of analysis window duration. In our investigations, we also employ three objective speech intelligibility measures based on the speech transmission index (STI). The experimental results show...

10.1109/lsp.2008.2005755 article EN IEEE Signal Processing Letters 2008-01-01

Channel selection in the modulation domain for improved speech intelligibility in noise

Kamil Wójcicki Philipos C. Loizou

Background noise reduces the depth of the low-frequency envelope modulations known to be important for speech intelligibility. The relative strength of the target and masker envelope modulations can be quantified using a modulation signal-to-noise ratio, (S/N)mod, measure. Such a measure can be used in noise-suppression algorithms to extract target-relevant modulations from the corrupted (target + masker) envelopes for potential improvement in speech intelligibility. In the present study, envelopes are decomposed in the modulation spectral domain into a number of channels spanning the range of 0–30 Hz. Target-dominant...

10.1121/1.3688488 article EN The Journal of the Acoustical Society of America 2012-04-01

Role of modulation magnitude and phase spectrum towards speech intelligibility

Kuldip K. Paliwal Belinda Schwerin Kamil Wójcicki

10.1016/j.specom.2010.10.004 article EN Speech Communication 2010-10-26

Speech-Signal-Based Frequency Warping

Kuldip K. Paliwal Benjamin J. Shannon James Lyons Kamil Wójcicki

The speech signal is used for transmission of linguistic information. High energy portions of the speech spectrum have higher signal-to-noise ratios than low energy portions. As a result, these regions are more robust to noise. Since speech is known to be very robust to noise, it is expected that the high energy regions carry the majority of the linguistic information. This letter tries to derive a frequency warping function directly from the speech signal by sampling the frequency axis nonuniformly, with the high energy regions sampled more densely than the low energy regions. To achieve this, an ensemble average short-time power spectrum is computed from a large speech corpus...

10.1109/lsp.2009.2014096 article EN IEEE Signal Processing Letters 2009-03-09

Single-channel speech enhancement using Kalman filtering in the modulation domain

Stephen So Kamil Wójcicki Kuldip K. Paliwal

In this paper, we propose the modulation-domain Kalman filter (MDKF) for speech enhancement. In contrast to previous modulation domain-enhancement methods based on bandpass filtering, the MDKF is an adaptive and linear MMSE estimator that uses models of the temporal changes of the magnitude spectrum for both speech and noise. Also, because the Kalman filter is a joint magnitude and phase spectrum estimator, under non-stationarity assumptions, it is highly suited for modulation-domain processing, as modulation phase tends to contain more speech information than acoustic phase. Experimental results from...

10.21437/interspeech.2010-330 article EN Interspeech 2010 2010-09-26

Evaluation of the importance of time-frequency contributions to speech intelligibility in noise

Chengzhu Yu Kamil Wójcicki Philipos C. Loizou John H. L. Hansen Michael T. Johnson

Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered, which...

10.1121/1.4869088 article EN The Journal of the Acoustical Society of America 2014-05-01

Importance of the Dynamic Range of an Analysis Window Function for Phase-Only and Magnitude-Only Reconstruction of Speech

Kamil Wójcicki Kuldip K. Paliwal

The short-time Fourier transform (STFT) of a speech signal has two components: the magnitude spectrum and the phase spectrum. It is traditionally believed that the magnitude spectrum plays the dominant role for speech perception at small window durations (20-40 ms). However, recent perceptual studies have shown that the phase spectrum can contribute as much to speech intelligibility as the magnitude spectrum. It was observed that the use of a rectangular (non-tapered) analysis window for the computation of the phase spectrum is more advantageous than the Hamming (tapered) window. This paper investigates the effect of the dynamic range of an analysis window on phase-only...

10.1109/icassp.2007.367016 article EN 2007-04-01

Comparative evaluation of speech enhancement methods for robust automatic speech recognition

Kuldip K. Paliwal James Lyons Stephen So Anthony Stark Kamil Wójcicki

A comparative evaluation of speech enhancement algorithms for robust automatic speech recognition is presented. The evaluation is performed on a core test set of the TIMIT corpus. Mean objective quality scores as well as ASR correctness under two noise conditions are given.

10.1109/icspcs.2010.5709761 article EN 2010-12-01

Kalman filter with phase spectrum compensation algorithm for speech enhancement

Stephen So Kamil Wójcicki James Lyons Anthony Stark Kuldip K. Paliwal

In this paper, we propose to combine the Kalman filter with a recent speech enhancement technique, called the phase spectrum compensation procedure, or PSC. More specifically, we apply the PSC technique to initialise the Kalman filter, whereby PSC is used to clean the noisy speech prior to LPC estimation for the Kalman recursion. We refer to the combined technique as the Kalman-PSC filter. Using an objective speech quality measure, formal subjective listening tests and spectrogram analysis, we show that the proposed method results in improved speech quality.

10.1109/icassp.2009.4960606 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2009-04-01

Single channel speech enhancement using MMSE estimation of short-time modulation magnitude spectrum

Kuldip K. Paliwal Belinda Schwerin Kamil Wójcicki

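In this paper we investigate the enhancement of speech by applying MMSE short-time spectral magnitude estimation in the modulation domain. For this purpose, the traditional analysis-modification-synthesis framework is extended to include modulation domain processing. We compensate the noisy modulation spectrum for additive noise distortion by applying the MMSE short-time spectral magnitude estimation algorithm in the modulation domain. Subjective experiments were conducted to compare the quality of stimuli processed by the MMSE modulation magnitude estimator to those processed using the MMSE acoustic magnitude estimator and the modulation spectral subtraction method. The proposed method is shown to have better noise suppression than...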

10.21437/interspeech.2011-425 article EN Interspeech 2011 2011-08-27

Crowdsourced Multilingual Speech Intelligibility Testing

Laura Lechler Kamil Wójcicki

With the advent of generative audio features, there is an increasing need for rapid evaluation of their impact on speech intelligibility. Beyond the existing laboratory measures, which are expensive and do not scale well, there has been comparatively little work on crowdsourced assessment of intelligibility. Standards and recommendations are yet to be defined, and publicly available multilingual test materials are lacking. In response to this challenge, we propose an approach for a crowdsourced intelligibility assessment. We detail the test design, the collection and public release...

10.1109/icassp48485.2024.10447869 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

A new mask-based objective measure for predicting the intelligibility of binary masked speech

Chengzhu Yu Kamil Wójcicki Philipos C. Loizou John H. L. Hansen

Mask-based objective speech-intelligibility measures have been successfully proposed for evaluating the performance of binary masking algorithms. These measures were computed directly by comparing the estimated binary mask against the ground truth ideal binary mask (IdBM). Most of these measures, however, assign equal weight to all time-frequency (T-F) units. In this study, we propose to improve the existing mask-based objective measures by weighting each T-F unit according to its target or masker loudness. The proposed measure shows significantly better performance than two other measures.

10.1109/icassp.2013.6639025 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2013-05-01

Modulation domain spectral subtraction for speech enhancement

K.K. Paliwal Belinda Schwerin Kamil Wójcicki

In this paper we investigate the modulation domain as an alternative to the acoustic domain for speech enhancement. More specifically, we wish to determine how competitive the modulation domain is for spectral subtraction as compared to the acoustic domain. For this purpose, we extend the traditional analysis-modification-synthesis framework to include modulation domain processing. We then compensate the noisy modulation spectrum for additive noise distortion by applying the spectral subtraction algorithm in the modulation domain. Using subjective listening tests and objective speech quality evaluation we show that the proposed method results in improved speech quality...

10.21437/interspeech.2009-413 article EN Interspeech 2009 2009-09-06

Dual-microphone phase-difference-based SNR estimation with applications to speech enhancement

Frédéric Mustière Renato Nakagawa Kamil Wójcicki Ivo Merks Tao Zhang

This paper introduces novel two-channel a priori Signal-to-Noise Ratio (SNR) estimators for use in frequency-domain speech enhancement algorithms. The SNR estimation is based on the statistics of the noisy phase difference between two microphones in each frequency bin. Namely, the corresponding probability distribution is derived assuming a complex Gaussian model, and is written in terms of the SNR only. This shifts the SNR estimation problem into a classical statistical parameter estimation problem, which we propose to solve via an online version of the Method...

10.1109/iwaenc.2016.7602935 article EN 2016-09-01

The effect of the additivity assumption on time and frequency domain Wiener filtering for speech enhancement

Kamil Wójcicki Stephen So Kuldip K. Paliwal

In this paper, we investigate the validity of the common assumption made in Wiener filtering that the clean speech and noise signals are uncorrelated under the short-time analysis typically used for speech enhancement. In order to achieve this, we have performed speech enhancement experiments, where speech corrupted by additive white Gaussian noise is enhanced by a Wiener filter designed in the time as well as frequency domains. Results of the oracle-style experiments confirm that the inclusion of the additivity assumption results in negligible degradation of speech quality. Informal listening tests show...

10.21437/interspeech.2007-302 article EN Interspeech 2007 2007-08-27

Sidechain harmonic enhancement of noise corrupted speech for hearing impaired listeners

Kamil Wójcicki Kelly Fitz Karrie Recker Don R. Reynolds Tao Zhang

This work presents a single channel speech enhancement approach aimed at improving speech clarity for hearing impaired listeners under challenging listening conditions. The proposed method applies nonlinear distortions to speech components isolated from the observed noisy signal using aggressive speech enhancement. The enhanced components are then mixed back into the signal. The results show that the proposed method significantly improves speech clarity in noise.

10.1109/waspaa.2015.7336926 article EN 2015-10-01