
A Visual Interface for Human-Robot Interaction

Today's robots are rigid, insensitive machines used mostly in the manufacturing industry. Their interaction with humans is restricted to teach-and-play operations that simplify the programming of the desired trajectories. Once a robot is programmed, the gates to its cell must be shut and locked before it can begin its work. Safety usually means the strict separation of robots and humans. This is required because the robots lack sensors to detect humans in their environment, and because their closed-loop position control will apply maximum force to reach the preprogrammed positions.
This mutual exclusion of humans and robots strongly limits the applications of robots. Robots can only be used where the environment can be completely controlled, and every task must be carried out by the robot alone. Situations requiring the decision-making and planning capability of a supervisor must not arise, since no human is present to provide such help.
Robotic systems designed to actually work together with a human would open a wide range of applications, from high-load handling systems in manufacturing and construction to systems dedicated to interaction with humans, such as helping hands for the disabled and elderly. Complex tasks and non-repetitive action sequences that require supervision by a human operator could be executed with the support of robots but guided by the operator (supervised autonomy). Such systems would need two main features that today's robots lack: the ability to perceive humans in their environment, and a control scheme that keeps physical interaction with them safe.


System overview

[Figure: Integration of the vision-based human-robot interface and the robot control.]

The figure shows how the visual interface will be integrated with the robot control.
Our recent research has focused on the human-robot interface side. Natural interaction consists of two major parts: natural language understanding and synthesis on one side, and visual observation of the user's body pose and gestures on the other. The features of the human body that carry the most information about the intentions and desires of a user are the hands and the head, in particular the gaze point. In our earlier work we have shown that facial gestures can also be recognised by observing the motion of the head.

This project focuses on the real-time tracking and pose estimation of human heads in video sequences of head-and-shoulder images. The tracking is based on natural landmarks only, such as the eyes, eyebrows and mouth. Special emphasis was placed on the real-time capability of the system: using a Fujitsu MEP tracking system, the face tracking runs at the NTSC frame rate of 30 Hz. In parallel, a gesture recognition process uses the motion and feature state vector of the tracking process to recognise predefined gestures. Decomposing gestures into atomic actions allows more than 15 gestures to be recognised simultaneously at frame rate, and also yields a confidence-weighted prediction of the gesture currently being performed.
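To make the decomposition idea concrete, the following is a minimal sketch in Python, not the original implementation: the gesture names, atomic-action labels and matching logic are assumptions. It illustrates how a stream of atomic actions derived from the tracker's motion and feature state vector can be matched against several gesture templates in parallel, each reporting a confidence for the gesture currently in progress.

    # Minimal sketch of gesture recognition by decomposing gestures into atomic
    # actions. Gesture names, atomic-action labels and matcher logic are
    # illustrative assumptions, not the original implementation.

    from dataclasses import dataclass

    # Each gesture is defined as an ordered sequence of atomic actions,
    # assumed to be derived from the tracker's motion and feature state vector.
    GESTURES = {
        "nod_yes":  ["head_down", "head_up"],
        "shake_no": ["head_left", "head_right", "head_left"],
        "blink":    ["eyes_closed", "eyes_open"],
    }

    @dataclass
    class GestureMatcher:
        """Tracks how far the observed action stream has progressed through
        one gesture template and reports a confidence in [0, 1]."""
        template: list
        progress: int = 0

        def update(self, action: str) -> float:
            if self.progress < len(self.template) and action == self.template[self.progress]:
                self.progress += 1      # expected action observed: advance
            elif action == self.template[0]:
                self.progress = 1       # gesture appears to restart from its first action
            else:
                self.progress = 0       # unexpected action: reset
            return self.progress / len(self.template)

    def predict(matchers, action):
        """Feed one atomic action to all matchers in parallel and return the
        gestures ranked by their current confidence."""
        scores = {name: m.update(action) for name, m in matchers.items()}
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    matchers = {name: GestureMatcher(seq) for name, seq in GESTURES.items()}
    for atomic_action in ["head_down", "head_up"]:   # the two halves of a nod
        print(atomic_action, "->", predict(matchers, atomic_action))

Because every matcher is updated on every frame, partially completed gestures already carry a non-zero confidence, which is what allows a prediction of the gesture currently being performed rather than only a detection after it has finished.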
The aim of the project is to build a system that allows a robot arm to be controlled using facial gestures. Direct interaction between the human and the robot arm is also envisaged. A "helping hand" for the disabled or elderly, supporting daily activities such as eating, is one possible application of such a system.
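As a rough illustration of how recognised gestures could drive the robot arm through the interface shown in the figure, the following Python sketch maps gesture labels to arm commands. The command names, the confidence threshold and the dispatch function are hypothetical and not taken from the described system.

    # Hypothetical sketch of coupling the gesture recogniser's output to simple
    # robot-arm commands. Command names and the confidence threshold are
    # assumptions, not part of the original system.

    GESTURE_TO_COMMAND = {
        "nod_yes":    "confirm_action",   # e.g. confirm bringing the spoon to the mouth
        "shake_no":   "abort_action",
        "look_left":  "move_left",
        "look_right": "move_right",
    }

    CONFIDENCE_THRESHOLD = 0.8            # only act on well-supported predictions

    def dispatch(gesture, confidence, send_command):
        """Forward a recognised gesture to the robot controller if it maps to a
        command and its confidence is high enough. Returns True if a command was sent."""
        command = GESTURE_TO_COMMAND.get(gesture)
        if command is None or confidence < CONFIDENCE_THRESHOLD:
            return False
        send_command(command)
        return True

    # Example: a confident nod triggers the mapped command.
    dispatch("nod_yes", 0.93, send_command=print)   # prints "confirm_action"

Gating commands on the recogniser's confidence is one simple way to keep ambiguous head motions from moving the arm unintentionally, which matters in close interaction with a disabled or elderly user.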




Feedback & Queries: Jochen Heinzmann
Date Last Modified: Thursday, 24th Oct 1997