Human action recognition based on quaternion spatial-temporal convolutional neural network and LSTM in RGB videos

DOI: 10.1007/s11042-018-5893-9 Publication Date: 2018-03-21T22:08:50Z
ABSTRACT
Convolutional neural networks (CNNs) are the state-of-the-art method for action recognition across a variety of datasets. However, most existing CNN models rely on lower-level handcrafted features extracted from gray or RGB image sequences in small datasets, and therefore generalize poorly to realistic scenarios. We therefore propose a new deep learning network for action recognition, called QST-CNN-LSTM, which integrates a quaternion spatial-temporal convolutional neural network (QST-CNN) with a Long Short-Term Memory (LSTM) network. Unlike a traditional CNN, the QST-CNN takes a quaternion expression of an RGB image as input, so that the red, green, and blue channel values are treated jointly as a whole in the spatial convolutional layer, avoiding the loss of spatial features. Because raw video frames are large and contain redundant background, we pre-extract key motion regions from the RGB videos using an improved codebook algorithm. The QST-CNN is then combined with an LSTM to capture dependencies between different video clips. Experiments demonstrate that QST-CNN-LSTM improves recognition rates on the Weizmann, UCF Sports, and UCF11 datasets.
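The quaternion expression mentioned in the abstract can be illustrated with a minimal sketch: an RGB pixel is commonly encoded as a pure quaternion (zero real part) whose three imaginary components carry the red, green, and blue values, so a quaternion-valued convolution mixes the channels as a single entity rather than filtering them independently. The function below is a hypothetical illustration of that encoding, not the paper's implementation.

```python
import numpy as np

def rgb_to_quaternion(image):
    """Encode an H x W x 3 RGB image as an H x W x 4 array of pure
    quaternions (0, r, g, b). The real part is zero; the i, j, k
    components hold the colour channels jointly, which is what lets
    a quaternion convolution treat R, G, B as a whole."""
    h, w, _ = image.shape
    q = np.zeros((h, w, 4), dtype=np.float64)
    q[..., 1:] = image  # imaginary parts i, j, k <- R, G, B
    return q

# Toy 2x2 RGB image (values in 0..255).
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [128, 128, 128]]], dtype=np.float64)
q_img = rgb_to_quaternion(img)
print(q_img.shape)  # (2, 2, 4)
```

The quaternion layers of the actual QST-CNN then operate on these four-component values; the sketch only shows the input representation.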