Polite Robot: Visual Recognition of the Hand Using Deep Learning

Technorama 2018

The goal of our project was to create a demonstration system in which a small robot sees an extended human hand, accepts it as a greeting, and stretches out its own. Visual recognition of the hand, the main part of the system, proved to be a complicated task. Unlike many other gesture recognition studies, we did not use depth information from sensors such as Microsoft Kinect, nor temporal information from video sequences. Because we wanted to use simple hardware, we built the recognition system around static images captured with an ordinary video camera, treating the outstretched hand offered in greeting as a static gesture.
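As an illustration of how simple the hardware side can be, the sketch below (an assumed example, not our exhibit code) grabs a single frame from an ordinary webcam with OpenCV; such a static frame is what the gesture classifier works on.

```python
# Hedged sketch: capture one static frame from a plain video camera with
# OpenCV (no depth sensor, no use of motion between frames).
import cv2

def capture_frame(camera_index=0):
    """Read a single frame from the camera at `camera_index`."""
    cap = cv2.VideoCapture(camera_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("Could not read a frame from the camera")
    return frame

if __name__ == "__main__":
    frame = capture_frame()
    # The frame would then be resized, normalized and passed to the trained
    # hand classifier (see the Keras sketch further below).
    print("Captured frame of shape", frame.shape)
```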

In this work we took on a completely new application and created the solution from scratch. We collected a dedicated data set for the task, with our team members, friends and relatives posing for the images, and we can offer this data set to other researchers as well.

With the help of modern deep learning models and libraries (Keras, TensorFlow), we tested different convolutional neural network configurations and learning algorithms, and monitored and analysed the accuracy of the trained models. The experiments showed that what matters is not only the machine learning models, algorithms and their parameters, but also the transformations applied to the input images. The biggest breakthrough came from preprocessing the training samples so that the observed person is separated from the background. Presumably this is because the model then focuses on the person standing in front of the camera rather than on what happens behind them. Doing this separation in the prototype itself, without depth or motion information, is not easy.
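The sketch below is only illustrative of this setup: a small Keras CNN of the kind we experimented with, together with a background removal step in the spirit of the preprocessing described above. The layer sizes, the 128x128 input size and the choice of OpenCV's GrabCut for the separation are assumptions made for the example, not the exact configuration of our models.

```python
# Illustrative sketch, not the exhibit's exact code.
import numpy as np
import cv2
from tensorflow import keras
from tensorflow.keras import layers

def remove_background(image, rect):
    """Roughly separate the person (inside `rect`) from the background
    using OpenCV's GrabCut; background pixels are zeroed out."""
    mask = np.zeros(image.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(image, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    keep = np.where(
        (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0
    ).astype("uint8")
    return image * keep[:, :, None]

def build_model(input_shape=(128, 128, 3)):
    """Binary classifier: extended hand vs. no extended hand."""
    model = keras.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```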

In this exhibition we present a demonstration of the interactive robot system. Currently, the best results are obtained with data that have a simple, monochrome natural background; recognition accuracy on such validation data is 92%. In the near future we will try to build a model that recognizes the offered hand even better in more varied environments. “Technorama” is a great opportunity to test the system with many different environments and people.
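For illustration, validation accuracy of this kind can be measured with Keras roughly as follows; the model file name, directory layout and image size are assumptions, not our actual setup.

```python
# Hedged sketch of measuring validation accuracy with Keras.
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model = keras.models.load_model("hand_model.h5")  # hypothetical saved model

val_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "data/validation",        # hypothetical path: one subfolder per class
    target_size=(128, 128),
    class_mode="binary",
    shuffle=False,
)

loss, accuracy = model.evaluate(val_gen)
print(f"Validation accuracy: {accuracy:.2%}")  # about 92% on simple-background data
```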

We hope that the demonstration we have created will help attract people interested in the exact sciences, machine learning and robotics, as well as young future researchers and professionals.

Duration:
2018 - 2018