CNN-Based Processing of Acoustic and Radio Frequency Signals for Speaker Localization from MAVs

Direction of arrival and distance regression; Multi-stage convolutional neural network; RF fusion; RF-assisted multi-channel speech; Speaker localization; Speech

0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology

DOI: 10.21437/interspeech.2021-886

Publication Date: 2021-08-27T05:59:39Z

AUTHORS (4)

Andrea Toma

Daniele Salvati

Carlo Drioli

Gian Luca Foresti

ABSTRACT

A novel algorithm for speaker localization from micro aerial vehicles (MAVs) is investigated. It introduces a joint direction of arrival (DOA) and distance prediction method based on processing and fusing multi-channel speech data with radio frequency (RF) measurements of the received signal strength. Possible applications include unmanned aerial vehicle (UAV)-based reconnaissance and surveillance against intrusions, and search and rescue in hostile environments. A three-stage convolutional neural network (CNN) with a fusion layer is proposed for this task, with the objective of augmenting source localization from multi-channel speech signals. Two parallel CNNs process the speech and RF data, and after the fusion layer a regression network predicts the angle and distance to the source. To show the performance and effectiveness of this RF-assisted method, the experimental scenario and datasets are presented, and the experiments are discussed along with the results obtained.
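
The data flow described in the abstract (two parallel convolutional branches, a fusion layer, and a regression head with two outputs) can be illustrated with a minimal pure-Python sketch. This is not the authors' network: the layer counts, kernel sizes, global average pooling, fusion by concatenation, random untrained weights, and all function and parameter names below are assumptions chosen only to keep the example short and runnable.

```python
import random

def conv1d(signal, kernel):
    """Valid-mode 1-D cross-correlation of one channel with one kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    return [max(0.0, v) for v in values]

def branch(channels, kernels):
    """One convolutional branch: apply each kernel to every channel, sum the
    responses across channels, rectify, then global-average-pool to a single
    feature per kernel (pooling choice is an assumption)."""
    features = []
    for kernel in kernels:
        responses = [conv1d(ch, kernel) for ch in channels]
        summed = [sum(vals) for vals in zip(*responses)]
        activated = relu(summed)
        features.append(sum(activated) / len(activated))
    return features

def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_params(seed=0, n_speech=4, n_rf=2, hidden=8):
    """Random, untrained weights; all sizes are hypothetical."""
    rng = random.Random(seed)
    fused = n_speech + n_rf
    def rand_rows(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]
    return {
        "speech_kernels": rand_rows(n_speech, 5),
        "rf_kernels": rand_rows(n_rf, 3),
        "w_hidden": rand_rows(hidden, fused),
        "b_hidden": [0.0] * hidden,
        "w_out": rand_rows(2, hidden),
        "b_out": [0.0, 0.0],
    }

def localize(speech, rssi, params):
    """Stage 1: parallel branches. Stage 2: fusion. Stage 3: regression."""
    speech_feat = branch(speech, params["speech_kernels"])
    rf_feat = branch(rssi, params["rf_kernels"])
    fused = speech_feat + rf_feat  # fusion by concatenation (assumption)
    hidden = relu(dense(fused, params["w_hidden"], params["b_hidden"]))
    doa, distance = dense(hidden, params["w_out"], params["b_out"])
    return doa, distance

# Synthetic inputs: 4 microphone channels and one RSSI time series.
rng = random.Random(1)
speech = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(4)]
rssi = [[rng.uniform(-90, -40) for _ in range(16)]]
params = init_params()
doa, distance = localize(speech, rssi, params)
print(doa, distance)  # untrained weights, so the values carry no meaning
```

Because the weights are random, the two outputs only demonstrate that a single forward pass yields one angle-like and one distance-like value from both modalities; a real model would be trained on labeled recordings.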

CITATIONS (3)
