Hand gestures are one of the most intuitive and natural means of communication between humans and computers. Recently, hand gesture recognition has become more significant in human-computer interaction applications because of its convenience and naturalness, and it is attracting a growing amount of research and development.
In this paper, we present a novel approach for continuous dynamic hand gesture recognition. Our approach contains two main modules. First, the gesture spotting module pre-segments a video sequence containing continuous gestures into isolated gestures. Second, the gesture classification module classifies the segmented gestures. In the gesture spotting module, the motion of the palm and the movements of the fingers are fed into a Bidirectional Long Short-Term Memory (Bi-LSTM) network to spot gestures. In the gesture classification module, three residual 3D Convolutional Neural Networks based on the ResNet architecture (3D_ResNet) and one Long Short-Term Memory (LSTM) network are combined to exploit multiple data channels such as RGB, optical flow, depth, and the 3D positions of key joints.
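The two-stage structure described above (spot, then classify) can be illustrated with a minimal sketch. It assumes the spotting stage yields one gesture/non-gesture flag per frame; how the Bi-LSTM output is actually converted into segment boundaries is not stated here, so the grouping rule, the `min_len` parameter, and the names `frames_to_segments` and `recognize` are hypothetical. The `classify` callable is only a stand-in for the 3D_ResNet and LSTM classifier.

```python
from typing import Callable, List, Tuple


def frames_to_segments(is_gesture: List[bool], min_len: int = 1) -> List[Tuple[int, int]]:
    """Group consecutive gesture frames into (start, end) segments, end exclusive.

    Assumption: `is_gesture` stands in for per-frame output of a spotting model.
    Runs shorter than `min_len` frames are discarded (a hypothetical filter).
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, flag in enumerate(is_gesture):
        if flag and start is None:
            start = i  # a gesture run begins
        elif not flag and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None  # the run has ended
    # Close a run that extends to the last frame.
    if start is not None and len(is_gesture) - start >= min_len:
        segments.append((start, len(is_gesture)))
    return segments


def recognize(is_gesture: List[bool],
              classify: Callable[[int, int], int]) -> List[Tuple[int, int, int]]:
    """Stage 2: label each spotted segment with a placeholder classifier."""
    return [(s, e, classify(s, e)) for s, e in frames_to_segments(is_gesture)]
```

For example, `recognize([False, True, True, False, True], lambda s, e: e - s)` labels each spotted segment with its length, which is enough to check the segmentation logic before a real classifier is plugged in.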
Experiments on three public datasets (the Chalearn LAP ConGD dataset, 20BN-Jester, and the NVIDIA Dynamic Hand Gesture Dataset) show promising performance. Our approach achieves a mean Jaccard Index of 0.6159 on the Chalearn LAP ConGD dataset, outperforming state-of-the-art methods.
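For readers unfamiliar with the metric, the mean Jaccard Index can be sketched as follows. This is a simplified illustration, not the official evaluation script: it assumes each sequence is described by a mapping from gesture label to the set of frame indices carrying that label, averages the per-label overlap over the labels present in the ground truth, and then averages over sequences. Predicted labels that do not appear in the ground truth are ignored here, which may differ from the official protocol. The function names are hypothetical.

```python
from typing import Dict, List, Set

Frames = Dict[int, Set[int]]  # gesture label -> frame indices with that label


def sequence_jaccard(truth: Frames, pred: Frames) -> float:
    """Average intersection-over-union across the ground-truth labels of one sequence."""
    if not truth:
        return 0.0
    total = 0.0
    for label, gt_frames in truth.items():
        pr_frames = pred.get(label, set())
        union = gt_frames | pr_frames
        # Overlap between true and predicted frames for this label.
        total += len(gt_frames & pr_frames) / len(union) if union else 0.0
    return total / len(truth)


def mean_jaccard(truths: List[Frames], preds: List[Frames]) -> float:
    """Average the per-sequence scores over all sequences."""
    scores = [sequence_jaccard(t, p) for t, p in zip(truths, preds)]
    return sum(scores) / len(scores) if scores else 0.0
```

Under this definition a prediction scores 1.0 only when every gesture is both correctly labeled and aligned with its ground-truth frames, so the metric penalizes spotting errors as well as classification errors.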