Voting for two speaker segmentation

Speaker diarisation, Weighted voting
DOI: 10.21437/interspeech.2006-187 Publication Date: 2021-08-26T09:41:25Z
ABSTRACT
The process of locating the end points of each speaker's voice in an audio file and then clustering the segments based on speaker identity is called speaker segmentation. In this paper we present a method for two speaker segmentation, though it can be extended to more than two speakers. Most methods for speaker segmentation start with an initial, computationally inexpensive segmentation method, followed by a more accurate segment clustering. We describe a simple algorithm that improves the accuracy of the segment clustering while not increasing the computational complexity. Since the clustering is done iteratively, the improvement in each step results in a significant overall increase in cluster purity. We borrow ideas from speaker recognition to perform frame voting. We look at each frame as an independent classifier deciding which speaker generated the segment. These 'classifiers' are combined by voting to make a decision on which segments should be clustered together. This change leads to a 56.9% decrease in error rates on a two speaker segmentation task on the SWITCHBOARD corpus.
Index Terms: speaker segmentation, voting, classifier combination, speaker change detection
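The frame voting idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features, speaker models, or vote weighting, so everything below is an assumption made for illustration: frames are single numbers, each speaker is a hypothetical 1-D Gaussian, and every frame casts one unweighted vote. The sketch contrasts voting with the more conventional decision based on the summed log-likelihood of the whole segment.

```python
import math

def gaussian_log_likelihood(x, mean, var):
    """Log density of a 1-D Gaussian (a stand-in for a real speaker model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def frame_votes(frames, models):
    """Treat each frame as an independent classifier that votes for one model."""
    votes = [0] * len(models)
    for x in frames:
        scores = [gaussian_log_likelihood(x, m, v) for m, v in models]
        votes[scores.index(max(scores))] += 1  # ties go to the first model
    return votes

def assign_by_voting(frames, models):
    """Assign the segment to the model that collects the most frame votes."""
    votes = frame_votes(frames, models)
    return votes.index(max(votes))

def assign_by_total_likelihood(frames, models):
    """Baseline: assign the segment to the model with the highest summed log-likelihood."""
    totals = [
        sum(gaussian_log_likelihood(x, m, v) for x in frames)
        for m, v in models
    ]
    return totals.index(max(totals))

# Hypothetical speaker models as (mean, variance) pairs; not taken from the paper.
models = [(0.0, 1.0), (3.0, 1.0)]

# Four frames near speaker 0 plus one outlier frame (e.g. a noise burst).
segment = [0.2, -0.1, 0.4, 0.1, 12.0]

print(frame_votes(segment, models))
print(assign_by_voting(segment, models))
print(assign_by_total_likelihood(segment, models))
```

In this toy segment the single outlier frame dominates the summed log-likelihood and flips the baseline decision to the second model, while under voting it counts as only one vote out of five. Whether this is the mechanism behind the reported improvement is not stated in the abstract.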