MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
DOI:
10.48550/arxiv.2401.09512
Publication Date:
2024-01-01
AUTHORS (9)
ABSTRACT
Text-to-Speech (TTS) technology brings significant advantages, such as giving a voice to those with speech impairments, but also enables audio deepfakes and spoofs. The former mislead individuals and may propagate misinformation, while the latter undermine biometric security systems. AI-based detection can help address these challenges by automatically differentiating between genuine and fabricated recordings. However, such models are only as good as their training data, which is currently severely limited due to an overwhelming concentration on English and Chinese audio in anti-spoofing databases, thus restricting their worldwide effectiveness. In response, this paper presents the Multi-Language Audio Anti-Spoof Dataset (MLAAD), created using 52 TTS models, comprising 19 different architectures, to generate 160.1 hours of synthetic speech in 23 languages. We train and evaluate three state-of-the-art deepfake detection models on MLAAD and observe that MLAAD demonstrates superior performance over comparable datasets like InTheWild or FakeOrReal when used as a training resource. Furthermore, in comparison with the renowned ASVspoof 2019 dataset, MLAAD proves to be complementary: in tests across eight datasets, the two alternately outperformed each other, each excelling on four of them. By publishing MLAAD and making the trained models accessible via an interactive webserver, we aim to democratize anti-spoofing technology, bringing it beyond the realm of specialists and contributing to global efforts against audio spoofing and deepfakes.
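The evaluation described in the abstract rests on a cross-dataset protocol: train a detector on one corpus (e.g., MLAAD or ASVspoof 2019) and score it on several others, reporting one number per evaluation set. The sketch below is not the authors' pipeline; it only illustrates the scoring side of such a comparison using the Equal Error Rate (EER), a metric commonly reported for anti-spoofing detectors. The dataset names come from the abstract, while the labels and scores are random placeholders standing in for a real detector and real loaders.

import numpy as np
from sklearn.metrics import roc_curve


def equal_error_rate(labels: np.ndarray, scores: np.ndarray) -> float:
    """EER: the operating point where false-accept and false-reject rates meet.

    `labels` are 1 for spoofed and 0 for bona fide audio; `scores` are the
    detector's spoof scores (higher = more likely spoofed).
    """
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))  # point where FPR is closest to FNR
    return float((fpr[idx] + fnr[idx]) / 2.0)


if __name__ == "__main__":
    # Hypothetical evaluation sets; the paper's comparison spans eight corpora.
    eval_sets = ["MLAAD", "ASVspoof2019", "InTheWild", "FakeOrReal"]
    rng = np.random.default_rng(0)
    for name in eval_sets:
        labels = rng.integers(0, 2, size=1000)             # placeholder ground truth
        scores = labels + rng.normal(0.0, 0.8, size=1000)  # placeholder detector scores
        print(f"{name}: EER = {equal_error_rate(labels, scores):.3f}")

In a real comparison, the two training sources (MLAAD and ASVspoof 2019) would each produce one such table of per-dataset EERs, and the abstract's claim is that each source yields the lower EER on four of the eight evaluation sets.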