TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms
Bone conduction
Wearable Technology
DOI: 10.48550/arxiv.2405.01242
Publication Date: 2024-05-02
AUTHORS (5)
ABSTRACT
We propose TRAMBA, a hybrid transformer and Mamba architecture for acoustic and bone conduction speech enhancement, suitable for mobile and wearable platforms. Bone conduction speech enhancement has been impractical to adopt in mobile and wearable platforms for several reasons: (i) data collection is labor-intensive, resulting in scarcity; (ii) there exists a performance gap between state-of-the-art models with memory footprints of hundreds of MBs and methods better suited for resource-constrained systems. To adapt TRAMBA to vibration-based sensing modalities, we pre-train TRAMBA with audio speech datasets that are widely available. Then, users fine-tune with a small amount of bone conduction data. TRAMBA outperforms state-of-the-art GANs by up to 7.3% in PESQ and 1.8% in STOI, with an order of magnitude smaller memory footprint and an inference speed-up of up to 465 times. We integrate TRAMBA into real systems and show that TRAMBA (i) improves battery life of wearables by up to 160% by requiring less data sampling and transmission; (ii) generates higher quality voice in noisy environments than over-the-air speech; (iii) requires a memory footprint of less than 20.0 MB.
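The abstract ties the battery savings to sampling and transmitting less data on the wearable, with the model restoring the missing bandwidth afterwards. The sketch below is only an illustration of that idea, not the authors' pipeline: it builds a (low-rate input, full-rate target) pair of the kind a super-resolution model could be trained on, and reports how much less data the low-rate stream carries. The sample count, the decimation factor of 4, the 440 Hz test tone, and all function names are hypothetical; the boxcar averaging is a crude stand-in for a proper anti-aliasing filter.

```python
import math


def make_sine(freq_hz, sample_rate_hz, num_samples):
    """Generate a test tone; stands in for a real full-rate speech recording."""
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(num_samples)
    ]


def decimate(signal, factor):
    """Crude decimation: average each block of `factor` samples (boxcar
    low-pass) and keep one value per block. Trailing samples that do not
    fill a whole block are dropped."""
    usable = len(signal) - len(signal) % factor
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


def make_training_pair(full_rate_signal, factor):
    """Return (low_rate_input, full_rate_target) for a super-resolution model."""
    return decimate(full_rate_signal, factor), full_rate_signal


# Assumed values for illustration only; the abstract does not state them.
HIGH_RATE_HZ = 16000
FACTOR = 4
NUM_SAMPLES = 1600  # 0.1 s at the assumed 16 kHz rate

target = make_sine(440.0, HIGH_RATE_HZ, NUM_SAMPLES)
low, high = make_training_pair(target, FACTOR)

# Fraction of samples the wearable no longer has to capture and transmit.
reduction = 1 - len(low) / len(high)
print(f"low-rate samples: {len(low)}, full-rate samples: {len(high)}")
print(f"data reduction: {reduction:.0%}")
```

With a decimation factor of 4, the device sends a quarter of the samples. How much battery this saves depends on the radio and sensor hardware, which this sketch does not model.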