
SMS-WSJ: Database, performance measures, and baseline recipe for multi-channel source separation and recognition

FOS: Computer and information sciences; 03 medical and health sciences; Sound (cs.SD); Computer Science - Computation and Language; Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; 0305 other medical science; Computation and Language (cs.CL); Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing

DOI: 10.48550/arxiv.1910.13934

Publication Date: 2019-01-01


AUTHORS (4)

Drude, Lukas

Heitkaemper, Jens

Boeddeker, Christoph

Haeb-Umbach, Rein...

ABSTRACT

Submitted to ICASSP 2020.

We present a multi-channel database of overlapping speech for training, evaluation, and detailed analysis of source separation and extraction algorithms: SMS-WSJ (Spatialized Multi-Speaker Wall Street Journal). It consists of artificially mixed speech taken from the WSJ database, but unlike earlier databases we consider all WSJ0+1 utterances and strictly separate the speaker sets present in the training, validation, and test sets. When spatializing the data we ensure a high degree of randomness with respect to room size, array center and rotation, as well as speaker position. Furthermore, this paper offers a critical assessment of recently proposed measures of source separation performance. Alongside the code to generate the database, we provide a source separation baseline and a Kaldi recipe with competitive word error rates to provide common ground for evaluation.
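The randomized spatialization described in the abstract can be illustrated with a minimal sketch. It is not the authors' generation code: the `Scenario` class, the `sample_scenario` function, and every numeric range below are hypothetical and chosen only so that the example is self-consistent. The sketch draws a room size, an array center, an array rotation, and one position per speaker from uniform distributions.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Scenario:
    room_size: tuple         # (length, width, height) in metres
    array_center: tuple      # (x, y, z) in metres
    array_rotation: float    # rotation about the vertical axis in radians
    speaker_positions: list  # one (x, y, z) tuple per speaker


def sample_scenario(rng, num_speakers=2):
    """Draw one random room, array, and speaker geometry.

    All ranges are illustrative assumptions, not the values used for SMS-WSJ.
    """
    # Assumed room dimensions.
    room = (rng.uniform(6.0, 10.0), rng.uniform(5.0, 8.0), rng.uniform(2.5, 3.5))

    # Assumed: the array sits near the middle of the room with a small offset.
    center = (
        room[0] / 2 + rng.uniform(-0.5, 0.5),
        room[1] / 2 + rng.uniform(-0.5, 0.5),
        rng.uniform(1.0, 2.0),
    )

    # Random rotation of the array about the vertical axis.
    rotation = rng.uniform(0.0, 2.0 * math.pi)

    # Assumed: each speaker is placed 1 to 2 m from the array at a random azimuth.
    speakers = []
    for _ in range(num_speakers):
        azimuth = rng.uniform(0.0, 2.0 * math.pi)
        distance = rng.uniform(1.0, 2.0)
        speakers.append((
            center[0] + distance * math.cos(azimuth),
            center[1] + distance * math.sin(azimuth),
            rng.uniform(1.2, 1.9),
        ))

    return Scenario(room, center, rotation, speakers)


# Usage: a seeded generator makes the drawn geometry reproducible.
scenario = sample_scenario(random.Random(0), num_speakers=2)
```

Passing an explicit `random.Random` instance rather than using the module-level generator keeps each drawn geometry reproducible from its seed, which is useful when a simulated database has to be regenerated identically.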



