Filamentary Convolution for SLI: A Brain-Inspired Approach with High Efficiency
Convolution (computer science)
DOI:
10.3390/s25103085
Publication Date:
2025-05-13T15:31:49Z
AUTHORS (5)
ABSTRACT
Spoken language identification (SLI) relies on detecting key frequency characteristics like pitch, tone, and rhythm. While the short-time Fourier transform (STFT) generates time–frequency acoustic features (TFAF) for deep learning networks (DLNs), rectangular convolution kernels cause frequency mixing and aliasing, degrading feature extraction. We propose filamentary convolution to replace rectangular kernels, reducing parameters while preserving inter-frame features by focusing solely on frequency patterns. Visualization confirms its enhanced sensitivity to critical frequency variations (e.g., intonation, rhythm) for language recognition. Evaluated via self-built datasets and cross-validated with public corpora, filamentary convolution improves low-level feature extraction efficiency and synergizes with temporal models (LSTM/TDNN) to boost recognition. This method addresses aliasing limitations while maintaining computational efficiency in SLI systems.
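To make the contrast between rectangular and filamentary kernels concrete, here is a minimal sketch in plain Python. It assumes a filamentary kernel is a k x 1 filter that spans k frequency bins within a single time frame; the abstract does not state the exact kernel shape, so this shape, the toy spectrogram, and the helper `conv2d_valid` are illustrative assumptions rather than the authors' implementation.

```python
def conv2d_valid(spec, kernel):
    """Valid-mode 2-D cross-correlation; spec and kernel are indexed [freq][time]."""
    kf, kt = len(kernel), len(kernel[0])
    n_f, n_t = len(spec), len(spec[0])
    out = []
    for f in range(n_f - kf + 1):
        row = []
        for t in range(n_t - kt + 1):
            row.append(sum(kernel[i][j] * spec[f + i][t + j]
                           for i in range(kf) for j in range(kt)))
        out.append(row)
    return out

# Toy spectrogram (hypothetical data): 5 frequency bins x 4 time frames,
# with energy present only in frame 2.
spec = [[0.0, 0.0, 1.0, 0.0] for _ in range(5)]

# Rectangular kernel: 3 bins x 3 frames, so each output mixes neighbouring frames.
rect_kernel = [[1.0] * 3 for _ in range(3)]
# Assumed filamentary kernel: 3 bins x 1 frame, so each output sees one frame only.
fila_kernel = [[1.0] for _ in range(3)]

rect_out = conv2d_valid(spec, rect_kernel)
fila_out = conv2d_valid(spec, fila_kernel)

rect_params = len(rect_kernel) * len(rect_kernel[0])
fila_params = len(fila_kernel) * len(fila_kernel[0])

print("rectangular params:", rect_params, "output row:", rect_out[0])
print("filamentary params:", fila_params, "output row:", fila_out[0])
```

In this toy case the rectangular kernel spreads the energy of frame 2 across every output column and shortens the time axis, while the assumed filamentary kernel keeps all four frames separate and uses a third of the weights. That per-frame output is what a downstream temporal model such as an LSTM or TDNN would then consume.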