SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification

FOS: Computer and information sciences; Sound (cs.SD); Engineering; Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; 0202 electrical engineering, electronic engineering, information engineering; Computer Engineering; 02 engineering and technology; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
DOI: 10.21437/interspeech.2021-140 Publication Date: 2021-08-27T05:59:39Z
ABSTRACT
In this paper, we present SpecAugment++, a novel data augmentation method for deep neural network based acoustic scene classification (ASC). Unlike other popular data augmentation methods such as SpecAugment and mixup, which only work on the input space, SpecAugment++ is applied to both the input space and the hidden space of the deep neural networks to enhance the input and the intermediate feature representations. For an intermediate hidden state, the augmentation techniques consist of masking blocks of frequency channels and masking blocks of time frames, which improve generalization by enabling a model to attend not only to the most discriminative parts of the feature, but also to the entire feature. Apart from using zeros for masking, we also examine two approaches for masking based on the use of other samples within the minibatch, which helps introduce noise to the networks to make them more discriminative for classification. The experimental results on the DCASE 2018 Task 1 dataset and the DCASE 2019 Task 1 dataset show that our proposed method can obtain 3.6% and 4.7% accuracy gains, respectively, over a strong baseline without augmentation (i.e., CP-ResNet), and outperforms other previous data augmentation methods.
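The block masking described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `mask_hidden_state`, the `(batch, channels, freq, time)` layout, the maximum block widths, and the choice to apply one shared block position to the whole minibatch are all assumptions made for illustration. The abstract says only that two of the fill strategies use other samples within the minibatch; the `"cut"` mode (copy the partner's region) and the `"mix"` mode (average with the partner's region) shown here are assumed interpretations of that statement.

```python
import numpy as np


def mask_hidden_state(h, max_freq_width=8, max_time_width=8, mode="zero", rng=None):
    """Mask one block of frequency channels and one block of time frames.

    h    : hidden state, assumed shape (batch, channels, freq, time).
    mode : "zero" fills the masked blocks with zeros;
           "cut"  fills them with the same region of another minibatch sample (assumed variant);
           "mix"  fills them with the average of the sample and its partner (assumed variant).
    Returns a masked copy; the input array is left unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch, _, n_freq, n_time = h.shape

    # Pair every sample with another sample drawn from the same minibatch.
    partner = h[rng.permutation(batch)]

    if mode == "zero":
        fill = np.zeros_like(h)
    elif mode == "cut":
        fill = partner
    elif mode == "mix":
        fill = 0.5 * (h + partner)
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    out = h.copy()

    # Frequency block: random width in [0, max_freq_width], random start position.
    f_width = int(rng.integers(0, min(max_freq_width, n_freq) + 1))
    f_start = int(rng.integers(0, n_freq - f_width + 1))
    out[:, :, f_start:f_start + f_width, :] = fill[:, :, f_start:f_start + f_width, :]

    # Time block: random width in [0, max_time_width], random start position.
    t_width = int(rng.integers(0, min(max_time_width, n_time) + 1))
    t_start = int(rng.integers(0, n_time - t_width + 1))
    out[:, :, :, t_start:t_start + t_width] = fill[:, :, :, t_start:t_start + t_width]

    return out
```

In a training loop, a function like this would be called on the input spectrogram and on selected intermediate feature maps during training only, and skipped at evaluation time. Which layers to augment and how wide the blocks should be are tuning choices that the abstract does not specify.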