Kinect hand tracking and pose recognition

The Kinect sensor can measure depth accurately (to about 1 cm at a distance of 2 m) using a stereo pair composed of an infrared laser projector and a monochrome sensor. Depth imagery simplifies foreground/background segmentation and, combined with appropriate recognition algorithms, makes it easy to track the body joints of multiple users.

The Kinect sensor is therefore extremely useful for implementing natural user interfaces. One of the most critical limitations of Kinect-based interfaces is the need for persistence in order to interact with virtual objects: a user must keep her arm still for a noticeable span of time while pointing at the object she wants to interact with. The most natural way to overcome this limitation and improve the reactivity of the interface is to employ a vision module that recognizes simple hand poses (e.g. open/closed), adding a state to the virtual pointer represented by the user's hand.
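To make the idea of a pointer with state concrete, here is a minimal sketch in Python. The class name `HandPointer`, its fields, and the event names are hypothetical and chosen only for illustration; they are not taken from the project's code. It assumes that some upstream module reports, for every frame, whether the hand is closed and which object (if any) the pointer is hovering over.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HandPointer:
    """Virtual pointer driven by a tracked hand (hypothetical, for illustration)."""
    closed: bool = False                 # hand pose seen in the previous frame
    held_object: Optional[str] = None    # object currently grabbed, if any

    def update(self, hand_closed: bool, hovered: Optional[str]) -> Optional[str]:
        """Feed one frame; return 'grab', 'release', or None."""
        event = None
        if hand_closed and not self.closed and hovered is not None:
            # Open -> closed over an object: grab at once, no dwell time needed.
            self.held_object = hovered
            event = "grab"
        elif not hand_closed and self.closed and self.held_object is not None:
            # Closed -> open while holding something: release it.
            self.held_object = None
            event = "release"
        self.closed = hand_closed
        return event
```

With a pointer like this, the interaction is triggered by the pose transition rather than by holding the arm still over the target.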

In Lorenzo Usai's BSc thesis project we used the OpenNI library together with the NITE middleware to track the hands of multiple users. The depth imagery allowed us to obtain a precise segmentation of the users' hands.
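The sketch below illustrates one simple way depth can be used to segment a hand once a tracker has located it. It is not the thesis implementation: it assumes the depth map is available as a NumPy array in millimetres and that the tracker supplies the hand position as a pixel coordinate. The function name `segment_hand` and the window and tolerance values are illustrative assumptions.

```python
import numpy as np


def segment_hand(depth_mm, hand_rc, half_window=40, depth_tol_mm=100):
    """Return a boolean mask of pixels likely to belong to the hand.

    depth_mm: 2-D array of depth values in millimetres (0 = no reading).
    hand_rc:  (row, col) of the tracked hand point in the depth image.
    """
    rows, cols = depth_mm.shape
    r, c = hand_rc
    hand_depth = int(depth_mm[r, c])

    mask = np.zeros(depth_mm.shape, dtype=bool)
    # Restrict the search to a square window around the tracked point.
    r0, r1 = max(r - half_window, 0), min(r + half_window + 1, rows)
    c0, c1 = max(c - half_window, 0), min(c + half_window + 1, cols)
    window = depth_mm[r0:r1, c0:c1].astype(np.int32)

    # Keep valid pixels whose depth is close to the depth at the hand point.
    mask[r0:r1, c0:c1] = (window > 0) & (np.abs(window - hand_depth) <= depth_tol_mm)
    return mask
```

The resulting mask can then be applied to the registered RGB frame to cut out the hand image, since background pixels at a clearly different depth are rejected.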

Segmented RGB hand images are normalized with respect to orientation, and a fast descriptor based on an adaptation of SURF features is extracted. We trained an SVM classifier on ~31000 images of 8 different subjects to recognize hand poses (open/closed).
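The post does not say how the orientation is estimated, so the following is only one plausible approach, shown as a sketch: take the principal axis of the segmented hand mask (via PCA of the pixel coordinates) as the hand's orientation. The function name `principal_axis_angle` is hypothetical.

```python
import numpy as np


def principal_axis_angle(mask):
    """Angle in degrees, in [0, 180), between the image x axis and the mask's main axis."""
    rows, cols = np.nonzero(mask)
    coords = np.stack([cols, rows]).astype(float)    # x = column, y = row
    coords -= coords.mean(axis=1, keepdims=True)     # centre on the mask centroid
    cov = coords @ coords.T / coords.shape[1]        # 2x2 covariance of pixel positions
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    vx, vy = eigvecs[:, -1]                          # direction of largest variance
    return float(np.degrees(np.arctan2(vy, vx)) % 180.0)
```

Rotating each hand crop by the negative of this angle would bring all hands to a common orientation before the descriptor is computed. Note that a principal axis leaves a 180 degree ambiguity, which a real system would need to resolve by other means (for example using the forearm direction).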

Right-hand image classification at 1000-2000 mm distance

A Kalman filter is used at the end of our recognition pipeline to smooth the prediction results, removing the spikes caused by rare, occasional failures of the hand pose classifier. The resulting recognition system runs at 15 frames per second and has an accuracy of 97.97% (tested on data independent of the training set).
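As an illustration of this smoothing step, here is a minimal scalar Kalman filter in Python. The state model is an assumption made for this sketch, not a description of the thesis code: the classifier is assumed to output a per-frame score in [0, 1] (1 = closed), the true score is modelled as a slow random walk, and the variances `process_var` and `measurement_var` are illustrative values.

```python
def kalman_smooth(scores, process_var=1e-3, measurement_var=0.1):
    """Smooth a sequence of per-frame scores with a scalar Kalman filter.

    Model: the true score changes slowly between frames (random walk with
    variance process_var); each classifier output is a noisy measurement of it.
    """
    estimate, error_var = scores[0], 1.0
    smoothed = []
    for z in scores:
        # Predict: the state is unchanged, its uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed
```

Thresholding the smoothed score at 0.5 gives the final open/closed label. A single misclassified frame barely moves the estimate, while a sustained change of pose does cross the threshold after a few frames. The ratio between the two variances sets the trade-off between robustness to spikes and latency.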

Open/closed hand image sequence classification

Lorenzo Usai's thesis (available only in Italian)

About Lorenzo Seidenari

I’m currently a PhD student at the University of Florence. My research interests focus on applications of pattern recognition, machine learning, and computer vision, specifically in the field of human activity recognition.
