Audio Deepfake Detection with Self-Supervised WavLM and Multi-Fusion Attentive Classifier

Keywords: Pooling, Spoofing attack
DOI: 10.48550/arxiv.2312.08089 Publication Date: 2023-01-01
ABSTRACT
With the rapid development of speech synthesis and voice conversion technologies, Audio Deepfake has become a serious threat to the Automatic Speaker Verification (ASV) system. Numerous countermeasures have been proposed to detect this type of attack. In this paper, we report our efforts to combine the self-supervised WavLM model and a Multi-Fusion Attentive classifier for audio deepfake detection. Our method exploits the WavLM model to extract features that are more conducive to spoofing detection for the first time. Then, we propose a novel Multi-Fusion Attentive (MFA) classifier based on the Attentive Statistics Pooling (ASP) layer. The MFA captures complementary information at both the time and layer levels. Experiments demonstrate that our methods achieve state-of-the-art results on the ASVspoof 2021 DF set and provide competitive results on the ASVspoof 2019 LA set.
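To make the pooling idea in the abstract concrete, the following is a minimal NumPy sketch of attentive statistics pooling: each frame receives a learned attention weight, and the weighted mean and weighted standard deviation are concatenated into one fixed-size vector. It is not the authors' implementation. The random array stands in for per-layer WavLM features, the attention parameters are untrained, all names and sizes (attentive_stats_pooling, n_layers, att, and so on) are hypothetical, and the two-stage arrangement (pool over time within each layer, then pool over layers) is only one assumed way to combine time-level and layer-level information.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()


def attentive_stats_pooling(h, W, b, v):
    """Attentive statistics pooling over the first axis of h.

    h: (N, D) sequence of N feature vectors (frames or layers).
    W: (A, D), b: (A,), v: (A,) attention parameters (hypothetical names).
    Returns a (2 * D,) vector: attention-weighted mean and std.
    """
    scores = np.tanh(h @ W.T + b) @ v          # (N,) one score per item
    alpha = softmax(scores)                    # (N,) weights summing to 1
    mean = alpha @ h                           # (D,) weighted mean
    var = alpha @ (h ** 2) - mean ** 2         # (D,) weighted variance
    std = np.sqrt(np.clip(var, 1e-9, None))    # clip guards against round-off
    return np.concatenate([mean, std])


rng = np.random.default_rng(0)
n_layers, n_frames, dim, att = 4, 50, 8, 16    # illustrative sizes only

# Stand-in for hidden states from several layers of a self-supervised model.
feats = rng.standard_normal((n_layers, n_frames, dim))

# Time-level pooling: summarise the frames of each layer separately.
W_t = 0.1 * rng.standard_normal((att, dim))
b_t = np.zeros(att)
v_t = rng.standard_normal(att)
per_layer = np.stack(
    [attentive_stats_pooling(feats[l], W_t, b_t, v_t) for l in range(n_layers)]
)                                              # (n_layers, 2 * dim)

# Layer-level pooling (assumed arrangement): attend over the layer summaries.
W_l = 0.1 * rng.standard_normal((att, 2 * dim))
b_l = np.zeros(att)
v_l = rng.standard_normal(att)
utterance = attentive_stats_pooling(per_layer, W_l, b_l, v_l)  # (4 * dim,)

print(per_layer.shape, utterance.shape)
```

With all attention scores equal, the weights become uniform and the function reduces to ordinary mean and standard-deviation pooling, which is a convenient sanity check. In a trained system the resulting utterance vector would feed a classifier that separates bona fide from spoofed speech.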