Real-Time Vision for Human Face Tracking
An important aspect of the overall system is the visual interface that
allows the human operator to control the robot through facial gestures
and the gaze point.
The approach we use is a three-layered system, shown in the figure below.
At the lowest level, the vision system performs bitmap correlation in
hardware.
The results are measured feature positions, which may contain tracking
errors.
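As an illustration, a minimal software sketch of this correlation step
is given below, assuming grayscale images stored as NumPy arrays; in the
actual system this search runs on dedicated correlation hardware, and
the window format (top, left, height, width) is an assumption made for
the example.

    import numpy as np

    def correlate_template(image, template, window):
        """Search `window` = (top, left, height, width) of `image` for
        the best match to `template`; returns the (row, col) of the
        match and its sum-of-absolute-differences score (lower is
        better)."""
        top, left, h, w = window
        th, tw = template.shape
        best_pos, best_score = None, float("inf")
        for r in range(top, top + h - th + 1):
            for c in range(left, left + w - tw + 1):
                patch = image[r:r + th, c:c + tw]
                score = np.abs(patch.astype(float) - template).sum()
                if score < best_score:
                    best_pos, best_score = (r, c), score
        return best_pos, best_score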
The measured positions are forwarded to the 2-D model, which takes
geometric constraints in the image plane and the correlation distortion
into account to generate estimates of the feature positions.
This layer is implemented as a network of Kalman filters.
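As a rough illustration, one node of such a network might look like the
constant-velocity Kalman filter sketched below; the state layout, motion
model, and noise values are illustrative assumptions, not the published
filter design.

    import numpy as np

    class FeatureFilter:
        def __init__(self, x, y, q=1.0, r=4.0):
            self.s = np.array([x, y, 0.0, 0.0])    # state: x, y, vx, vy
            self.P = np.eye(4) * 100.0             # state covariance
            self.F = np.eye(4)                     # constant-velocity model
            self.F[0, 2] = self.F[1, 3] = 1.0
            self.H = np.eye(2, 4)                  # we observe x, y only
            self.Q = np.eye(4) * q                 # process noise
            self.R = np.eye(2) * r                 # measurement noise

        def predict(self):
            self.s = self.F @ self.s
            self.P = self.F @ self.P @ self.F.T + self.Q
            return self.s[:2]                      # predicted position

        def update(self, z):
            y = np.asarray(z) - self.H @ self.s    # innovation
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)   # Kalman gain
            self.s = self.s + K @ y
            self.P = (np.eye(4) - K @ self.H) @ self.P
            return self.s[:2]                      # estimated position

One way the geometric constraints could enter such a node is as a gate:
a measurement whose innovation is implausibly large is rejected before
the update step, so a mistracked feature does not corrupt the estimate.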
The estimated feature positions determine the locations of the hardware
search windows in the next image frame.
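A hypothetical helper shows how an estimated position could place the
next search window (window format as in the correlation sketch above):

    def next_search_window(est_x, est_y, size=32):
        """Center a size x size search window on the estimated
        position; returns (top, left, height, width)."""
        half = size // 2
        return (int(est_y) - half, int(est_x) - half, size, size)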
The 2-D image positions of the features are transferred to the 3-D model
of the feature locations.
Using multiple feature triplets, the 3-D pose of the head can be
determined and used for further calculations such as gesture recognition
or gaze-point detection.
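The system's own pose computation is not reproduced here, but the sketch
below shows one standard way to recover a pose from a single triplet,
assuming 3-D positions of the measured features are available: the rigid
transform aligning the head-model triplet to the measured triplet, found
with the Kabsch/Procrustes method.

    import numpy as np

    def pose_from_triplet(model_pts, measured_pts):
        """model_pts, measured_pts: 3x3 arrays, one 3-D point per row.
        Returns rotation R and translation t such that
        measured ~= R @ model + t."""
        mc, sc = model_pts.mean(axis=0), measured_pts.mean(axis=0)
        H = (model_pts - mc).T @ (measured_pts - sc)   # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))         # reflection guard
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T        # proper rotation
        t = sc - R @ mc
        return R, t

Pose estimates from several triplets can then be combined, for example
by averaging, to reduce the influence of any single mistracked feature.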
The 3-D model is also projected back into the image plane to adapt the
constraints in the 2-D model.
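A minimal pinhole-projection sketch of this back-projection step is
given below; the focal length and principal point are illustrative
assumptions, not calibration values from the system.

    import numpy as np

    def project(points_3d, f=500.0, cx=320.0, cy=240.0):
        """points_3d: Nx3 array in camera coordinates (Z > 0).
        Returns an Nx2 array of image positions."""
        X, Y, Z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
        return np.stack([f * X / Z + cx, f * Y / Z + cy], axis=1)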
All three layers run at 30 Hz.
Feedback & Queries: Jochen Heinzmann
Date Last Modified: Thursday, 24th Oct 1997