NFDI4DS | UHH-SEMS - Publication Details

TF-Mamba: A Time-Frequency Network for Sound Source Localization

FOS: Computer and information sciences; Sound (cs.SD); Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing

DOI: 10.48550/arxiv.2409.05034

Publication Date: 2024-09-08

AUTHORS (2)

Yang Xiao

Rohan Kumar Das

ABSTRACT

Sound source localization (SSL) determines the position of sound sources using multi-channel audio data. It is commonly used to improve speech enhancement and separation. Extracting spatial features is crucial for SSL, especially in challenging acoustic environments. Previous studies performed well based on long short-term memory models. Recently, a novel scalable state space model (SSM) referred to as Mamba demonstrated notable performance across various sequence-based modalities, including speech. This study introduces Mamba for SSL tasks. We consider a Mamba-based model to analyze spatial features from speech signals by fusing both time and frequency features, and we develop an SSL system called TF-Mamba. This system integrates time and frequency fusion, with Bidirectional Mamba managing both time-wise and frequency-wise processing. We conduct experiments on a simulated dataset and the LOCATA dataset. Experiments show that TF-Mamba significantly outperforms other advanced methods on simulated and real-world data.
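
The time-wise and frequency-wise bidirectional processing mentioned in the abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the names `linear_scan`, `bidirectional_scan`, and `tf_block` are hypothetical, the fixed-coefficient recurrence is only a toy stand-in for a Mamba block (it has no input-dependent, selective parameters), and the frequency-then-time ordering, the summation of the two directions, and the residual connection are not taken from the paper.

```python
import numpy as np

def linear_scan(x, a, b):
    """Toy state-space recurrence h[n] = a * h[n-1] + b * x[n] along axis 0.

    Stand-in for a Mamba block (assumption): coefficients are fixed scalars.
    """
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for n in range(x.shape[0]):
        h = a * h + b * x[n]
        out[n] = h
    return out

def bidirectional_scan(x, a=0.9, b=0.1):
    """Scan the sequence forward and reversed, then sum the two passes."""
    fwd = linear_scan(x, a, b)
    bwd = linear_scan(x[::-1], a, b)[::-1]
    return fwd + bwd

def tf_block(feats):
    """feats: (T, F, D) array with T frames, F frequency bins, D channels."""
    # Frequency-wise pass: bring the frequency axis to the front and scan it.
    freq_first = np.moveaxis(feats, 1, 0)
    freq_out = np.moveaxis(bidirectional_scan(freq_first), 0, 1)
    # Time-wise pass: the time axis is already axis 0.
    time_out = bidirectional_scan(freq_out)
    # Residual connection (an assumption of this sketch).
    return feats + time_out

rng = np.random.default_rng(0)
feats = rng.standard_normal((50, 64, 8))  # hypothetical feature tensor
out = tf_block(feats)
print(out.shape)  # (50, 64, 8)
```

Because each bidirectional pass sees the whole sequence in both directions, every time-frequency bin in the output depends on all other bins, which is the property that fusing time and frequency processing is meant to provide.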