Music emotion recognition using convolutional long short term memory deep neural networks

Mel-frequency cepstrum
DOI: 10.1016/j.jestch.2020.10.009 Publication Date: 2020-11-14T13:24:21Z
ABSTRACT
In this paper, we propose an approach for music emotion recognition based on a convolutional long short term memory deep neural network (CLDNN) architecture. In addition, we construct a new Turkish emotional music database composed of 124 Turkish traditional music excerpts with a duration of 30 s each, and the performance of the proposed approach is evaluated on the constructed database. We utilize features obtained by feeding convolutional neural network (CNN) layers with log-mel filterbank energies and mel frequency cepstral coefficients (MFCCs), in addition to standard acoustic features. Classification results show that the best performance is obtained when the new feature set is combined with the standard features using the long short term memory (LSTM) + deep neural network (DNN) classifier. An overall accuracy of 99.19% is obtained using the proposed system with 10-fold cross-validation. Specifically, a 6.45 point improvement is achieved. Additionally, the results also show that the LSTM + DNN classifier yields 1.61, 1.61 and 3.23 point improvements in music emotion recognition accuracy compared to the k-nearest neighbor (k-NN), support vector machine (SVM), and Random Forest classifiers, respectively.
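The reported percentages are easier to interpret as excerpt counts. The sketch below assumes, which the abstract does not state, that accuracy is pooled over all 124 excerpts across the 10 folds and that the folds are of near-equal size. The helper names `accuracy_percent` and `fold_sizes` are hypothetical and used only for illustration.

```python
# Sketch: relate the reported percentages to excerpt counts.
# Assumption (not stated in the abstract): accuracy is pooled over all
# 124 excerpts, i.e. correct predictions are summed across the 10 folds.

N_EXCERPTS = 124  # size of the database, from the abstract
N_FOLDS = 10      # 10-fold cross-validation, from the abstract


def accuracy_percent(n_correct, n_total=N_EXCERPTS):
    """Return n_correct / n_total as a percentage, rounded to two decimals."""
    return round(100.0 * n_correct / n_total, 2)


def fold_sizes(n_items=N_EXCERPTS, n_folds=N_FOLDS):
    """Return near-equal fold sizes; the first (n_items % n_folds) folds get one extra item."""
    base, extra = divmod(n_items, n_folds)
    return [base + 1 if i < extra else base for i in range(n_folds)]


if __name__ == "__main__":
    # 123 of 124 correct reproduces the reported overall accuracy.
    print(accuracy_percent(123))
    # Gaps of 2, 4 and 8 excerpts reproduce the reported point differences.
    for n_excerpts in (2, 4, 8):
        print(n_excerpts, accuracy_percent(n_excerpts))
    # Test-fold sizes under an even 10-way split of 124 excerpts.
    print(fold_sizes())
```

Under these assumptions, 99.19% corresponds to a single misclassified excerpt out of 124, the 1.61 and 3.23 point gaps correspond to 2 and 4 excerpts, and the 6.45 point improvement corresponds to 8 excerpts. Each test fold holds only 12 or 13 excerpts, so one excerpt shifts a fold's accuracy by roughly 8 points.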
REFERENCES (52)
CITATIONS (86)