DenseGCN: A multi‐level and multi‐temporal graph convolutional network for action recognition
Discriminative model
Feature (linguistics)
RGB color model
Benchmark (surveying)
Pyramid (geometry)
DOI:
10.1049/ipr2.12872
Publication Date:
2023-08-01T08:27:59Z
AUTHORS (2)
ABSTRACT
With the exponential growth of video data, action recognition has become an increasingly important area of study. Despite various advancements, achieving a balance between detection accuracy and lightness remains a formidable challenge, primarily due to the complexity of existing models. To address this issue, DenseGCN is developed, a lightweight network designed to optimize efficiency. The aim was to create a model that is highly accurate while remaining lightweight for real‐world applications. DenseGCN operates via a unique three‐level feature fusion system. The initial stage involves the Multi‐level Fusion Network (MlFN), which contains dense connections and a Spatial‐Temporal Attention module (STF‐Att), to eliminate the bias in feature extraction caused by deep networks. In the next stage, RefineBone tackles the optimization issues of low‐dimensional layers by leveraging high‐dimensional layers, thus avoiding gradient stacking. Finally, the Multi‐temporal Feature Pyramid Network (MF‐FPN) generates a discriminative classification feature map by repetitively combining data from multiple dimensions. This strategy has proven successful in refining the extracted feature, allowing high accuracy even with a reduced number of channels. The efficient design not only contributes to further research in developing lightweight networks but also offers enhanced possibilities for real‐world implementations. On two large‐scale datasets, NTU RGB+D 60 and 120, DenseGCN outperformed other state‐of‐the‐art methods, reaching 92.7% accuracy on the X‐View benchmark. It is 10.2 × faster and 10 × smaller than the spatial temporal graph attention network (STGAT) proposed in 2022 while retaining very competitive accuracy. The findings suggest that DenseGCN significantly improves the quality of feature extraction. As a result, DenseGCN presents remarkable accuracy and lightness.
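The abstract mentions dense connections over skeleton graph features but gives no implementation details. As a rough illustration of the general idea only, the following is a minimal sketch of DenseNet-style connectivity applied to a single-frame graph convolution. It is not the authors' MlFN: the function names, the row-normalized adjacency, the random weights, and the toy five-joint skeleton are all assumptions made for this example, and the temporal dimension and attention module are omitted.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(edges, num_joints):
    """Row-normalized adjacency with self-loops, D^-1 (A + I) (an assumed choice)."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)] for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in a]

def graph_conv(x, a_hat, w):
    """One graph convolution: ReLU(A_hat @ X @ W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, x), w)]

def dense_gcn_block(x, a_hat, weights):
    """Hypothetical dense block: each layer sees the input plus all earlier outputs."""
    features = x
    for w in weights:
        out = graph_conv(features, a_hat, w)
        # Concatenate the new channels onto each joint's feature vector.
        features = [f + o for f, o in zip(features, out)]
    return features

def make_weights(in_channels, growth, num_layers, seed=0):
    """Random illustrative weights; layer k takes in_channels + k * growth inputs."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(growth)]
         for _ in range(in_channels + k * growth)]
        for k in range(num_layers)
    ]

# Toy skeleton: 5 joints in a chain, 3 input channels (e.g. x, y, z coordinates).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
a_hat = normalized_adjacency(edges, num_joints=5)
x = [[float(j), 0.5 * j, 1.0] for j in range(5)]
weights = make_weights(in_channels=3, growth=4, num_layers=3)
y = dense_gcn_block(x, a_hat, weights)
print(len(y), len(y[0]))  # 5 joints, 3 + 3 * 4 = 15 channels
```

Because every layer's output is concatenated rather than overwritten, the original input channels pass through the block unchanged, which is the property dense connectivity is usually chosen for: later layers can draw directly on shallow features while each layer adds only a small number of new channels.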
REFERENCES (39)
CITATIONS (2)