HalluAudio: Hallucinating Frequency as Concepts for Few-Shot Audio Classification
FOS: Computer and information sciences
Sound (cs.SD)
Artificial Intelligence (cs.AI)
Computer Science - Artificial Intelligence
Audio and Speech Processing (eess.AS)
0202 electrical engineering, electronic engineering, information engineering
FOS: Electrical engineering, electronic engineering, information engineering
02 engineering and technology
Computer Science - Sound
Electrical Engineering and Systems Science - Audio and Speech Processing
DOI: 10.48550/arxiv.2302.14204
Publication Date: 2023-06-04
AUTHORS (4)
ABSTRACT
Few-shot audio classification is an emerging topic that is attracting growing attention from the research community. Most existing work ignores the specific structure of the audio spectrogram and relies largely on embedding spaces borrowed from image tasks. In this work, we take advantage of this audio-specific format and propose a new method that hallucinates the high-frequency and low-frequency parts as structured concepts. Extensive experiments on ESC-50 and our curated balanced Kaggle18 dataset show that the proposed method outperforms the baseline by a notable margin. The way our method hallucinates high-frequency and low-frequency parts also makes it interpretable and opens up new possibilities for few-shot audio classification.
Accepted at ICASSP 2023
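The abstract does not spell out how the frequency parts are formed or combined, so the following is only a rough illustration of the general idea of treating low-frequency and high-frequency regions of a spectrogram as separate views in a few-shot classifier. Everything in it is an assumption rather than the authors' method: the midpoint split, the time-averaged placeholder embedding, the nearest-prototype rule, and all function names (`split_bands`, `embed`, `prototypes`, `classify`) are hypothetical.

```python
import numpy as np

def split_bands(spec, cut=None):
    """Split a (freq_bins, time_frames) spectrogram into low and high parts.

    Assumption: the split point defaults to the middle frequency bin.
    """
    n_bins = spec.shape[0]
    cut = n_bins // 2 if cut is None else cut
    return spec[:cut, :], spec[cut:, :]

def embed(part):
    # Placeholder embedding (assumption): average over time,
    # giving one value per frequency bin. A real system would use a network.
    return part.mean(axis=1)

def prototypes(support, labels):
    """Per-class mean embeddings for the full, low, and high views."""
    protos = {}
    for c in sorted(set(labels)):
        specs = [s for s, y in zip(support, labels) if y == c]
        views = [(s, *split_bands(s)) for s in specs]
        protos[c] = tuple(
            np.mean([embed(v[i]) for v in views], axis=0) for i in range(3)
        )
    return protos

def classify(query, protos):
    """Pick the class whose three prototypes are jointly closest to the query."""
    q_views = (query, *split_bands(query))

    def score(c):
        # Sum of Euclidean distances over the full, low, and high views.
        return sum(
            np.linalg.norm(embed(q) - p) for q, p in zip(q_views, protos[c])
        )

    return min(protos, key=score)
```

Because each view contributes its own distance term, one can inspect which band drives a decision, which is one plausible reading of the interpretability claim in the abstract.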