NFDI4DS | UHH-SEMS - Publication Details

Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion

FOS: Computer and information sciences; Computer Science - Sound (cs.SD); Computer Science - Artificial Intelligence (cs.AI); Computer Science - Computation and Language (cs.CL)

DOI: 10.21437/interspeech.2023-2335

Publication Date: 2023-08-14T08:22:20Z

AUTHORS (4)

Rui Liu

Jinhua Zhang

Guanglai Gao

Haizhou Li

ABSTRACT

To appear at Interspeech 2023.

Audio Deepfake Detection (ADD) is an emerging task that aims to detect fake audio generated by text-to-speech (TTS), voice conversion (VC), replay, and similar techniques. Traditional approaches take the mono signal as input and focus on robust feature extraction and effective classifier design. However, the dual-channel stereo information in an audio signal also carries important cues for deepfake detection, which prior work has not studied. In this paper, we propose a novel ADD model, termed M2S-ADD, that attempts to discover audio authenticity cues during the mono-to-stereo conversion process. We first project the mono signal to a stereo signal using a pretrained stereo synthesizer, then employ a dual-branch neural architecture to process the left and right channel signals, respectively. In this way, we effectively reveal the artifacts in the fake audio and thus improve ADD performance. Experiments on the ASVspoof2019 database show that M2S-ADD outperforms all baselines that take mono input. We release the source code at https://github.com/AI-S2-Lab/M2S-ADD.
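The data flow described in the abstract (mono input, stereo projection, one branch per channel, a joint decision) can be illustrated with a toy sketch. This is not the authors' implementation, which is available at the repository linked above. Everything here is an assumption made for illustration: `placeholder_mono_to_stereo` is a fixed delay-and-gain stand-in for the pretrained stereo synthesizer, `frame_log_energy` is a hand-written stand-in for a learned branch encoder, and fusing the branches by concatenation followed by a logistic layer with untrained random weights is only one plausible choice.

```python
import math
import random
from typing import List, Tuple

Signal = List[float]


def placeholder_mono_to_stereo(mono: Signal, delay: int = 2,
                               gain: float = 0.9) -> Tuple[Signal, Signal]:
    """Hypothetical stand-in for the pretrained stereo synthesizer.

    Left channel is a copy of the input; right channel is a delayed,
    attenuated copy. A real synthesizer would be a trained model.
    """
    delay = max(0, min(delay, len(mono)))
    left = list(mono)
    right = [0.0] * delay + [gain * x for x in mono[:len(mono) - delay]]
    return left, right


def frame_log_energy(channel: Signal, frame: int = 4) -> List[float]:
    """Toy branch encoder: log energy of each non-overlapping frame."""
    feats = []
    for start in range(0, len(channel) - frame + 1, frame):
        energy = sum(x * x for x in channel[start:start + frame]) / frame
        feats.append(math.log(energy + 1e-8))
    return feats


def dual_branch_score(mono: Signal, weights: List[float],
                      bias: float = 0.0) -> float:
    """Run both branches and fuse them into one score in (0, 1)."""
    left, right = placeholder_mono_to_stereo(mono)
    # Assumption: branches are fused by simple concatenation.
    fused = frame_log_energy(left) + frame_log_energy(right)
    if len(weights) != len(fused):
        raise ValueError("expected one weight per fused feature")
    logit = bias + sum(w * f for w, f in zip(weights, fused))
    return 1.0 / (1.0 + math.exp(-logit))


# Usage: 16 samples with 4-sample frames give 4 features per channel.
rng = random.Random(0)
mono = [math.sin(0.3 * n) for n in range(16)]
weights = [rng.uniform(-0.1, 0.1) for _ in range(8)]  # untrained weights
score = dual_branch_score(mono, weights)
```

With untrained weights the score carries no meaning; the sketch only shows that each channel of the synthesized stereo signal is encoded separately before a single decision is made.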

CITATIONS (8)
