Active Appearance Motion Models
This page demonstrates that AAMs are applicable to various motion-tracking applications. No explicit motion models have been enforced below; only simple frame-by-frame propagation is used. For optimal performance, dedicated motion models should be incorporated.
Rigid object tracking in 3D using an AAM
The aim was to perform real-time 3D tracking of a rigid object using a
low-cost web-cam (~$30), a PC and the AAM-API.
The project was done as a small exercise in testing the real-time
capabilities of Active Appearance Models. It should not be viewed as a
robust state-of-the-art tracking system, but merely as a demonstration of
performance and further proof of the general nature of AAMs.
It is thus not expected that this setup will in any way outperform
conventional tracking techniques tailored to handle a perspective projection of
a planar object.
The training set consisted of five images of
a DAT tape cassette (the nearest object on my desk at the time). The DAT
cassette was annotated using 12 landmarks. The training images were acquired
using the web-cam set to CIF (352x288) and subsequently resampled to QCIF (176x144).
Fig. 1. The training images for the AAM Tracker. Each one was
annotated using 12 landmarks.
From the five training images shown in figure 1, a two-level multi-scale AAM was built. All images were converted to 8-bit grey
scale prior to any processing by the AAM. The texture model consisted of
9100 pixels at level 0 and 2261 pixels at level 1.
The variance explained by the first three eigenvalues of the combined shape and
texture model was approximately 69%, 22% and 7%, respectively.
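The explained-variance figures above are simply the PCA eigenvalues of the combined shape/texture model, each divided by the total variance. A minimal sketch of that computation, using a hypothetical eigenvalue spectrum (the actual AAM-API eigenvalues are not given in the text):

```python
import numpy as np

# Hypothetical eigenvalue spectrum chosen for illustration only;
# the real combined shape/texture model has its own eigenvalues.
eigenvalues = np.array([6.9, 2.2, 0.7, 0.15, 0.05])

# Fraction of the total variance explained by each mode.
explained = eigenvalues / eigenvalues.sum()
print(np.round(explained[:3], 2))  # fractions for the first three modes
```

With this made-up spectrum, the first three modes explain 69%, 22% and 7% of the variance, matching the proportions quoted above.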
Modes of variation: the 1st combined mode, the 1st shape mode and the 1st texture mode.
Initialisation was performed on the first incoming frame using a search-based
initialisation on level 1 of the multi-level AAM. The result was then propagated
down to level 0.
Tracking from frame to frame was accomplished simply by propagating the result of the previous frame to the current one. Accuracy was traded for a higher frame rate by limiting the maximum number of iterations to three. However, since any residual error is carried over and reduced in subsequent frames, the search should still converge for moderate movements of the object.
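The propagation scheme above can be sketched as a simple loop. This is not AAM-API code; `aam_search` and the parameter representation are hypothetical stand-ins for the actual search routine:

```python
def aam_search(frame, params, max_iterations=3):
    """Hypothetical stand-in for the AAM optimisation: starting from the
    given pose/model parameters, iteratively refine the fit to the frame.
    Here it simply returns the parameters unchanged to keep the sketch
    runnable."""
    return params

def track(frames, initial_params):
    """Frame-by-frame propagation: the result of each frame seeds the
    search in the next; no motion model is used."""
    params = initial_params
    results = []
    for frame in frames:
        # Capping iterations at three trades accuracy for frame rate;
        # leftover error is reduced over the following frames.
        params = aam_search(frame, params, max_iterations=3)
        results.append(params)
    return results
```

For example, `track(frames, init)` with the search-based initialisation result as `init` yields one parameter set per frame.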
Tracking was performed by a Windows application on a live QCIF
(176x144) input from the web-cam. Four frames from an example tracking movie are given in figure 2.
The tracker reached a performance of 7-10 frames/sec. No temporal filtering was
performed to increase the robustness of the tracking.
Fig. 2. Four frames from an MPEG4 movie
showing the AAM Tracker (49 sec. 498KB).
Web-cam input was provided by the Vision Wizard from the VisionSDK.
Tracking of a deformable object
Click here to see tracking results from an off-line AAM tracking experiment done by Martin Egholm Nielsen, Tue Lehn-Schiøler & Mark Wrobel in January 2001 using the AAM-API.
Dan Witzner is using the AAM-API to perform eye tracking. Click here to see preliminary tracking results.
The AAM Mickey
Morten Rufus Blas, Mads Fogtmann Hansen and Kasper Olesen used the AAM-API in spring 2002 to control a virtual character: the infamous AAM-Mickey(!)
Real-time tracking of TV presenters
Jacob Overgaard Hansen, Steffen Holmslykke, Steffen Andersen and Søren Riisgaard used the AAM-API to track a female and a male TV presenter. Later they coded their own optimised version of the AAM search, which provided real-time tracking (sequences courtesy of TV2 and DR).
AAMM tracking of cardiac ultrasound images
Guillaume Chatelet and Eric Saloux (Ecole Nationale Superieure D'ingenieurs De Caen & Centre De Recherche – ENSICAEN) extended their copy of the AAM-API to encode the temporal statistics of echocardiogram sequences using the Active Appearance Motion Model (AAMM) proposed by Mitchell et al. The aim of this project is to study cardiac contractility by estimating various prognostic and therapeutic indexes from a localisation of the ventricle borders in ultrasound images. Click here to see an example of their results.
Using the AAM-API to convert speech to face movements
The parameters of an AAM of the face are controlled by features extracted from the sound. Feeding the model with recorded speech produces video sequences as seen in the examples.
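One simple way to realise such a mapping is a linear regression from per-frame audio features to AAM model parameters. The following is only an illustrative sketch with random placeholder data; the actual feature set and mapping used in the project are not described here:

```python
import numpy as np

# Placeholder data: 100 frames of 13-D audio features (e.g. MFCC-like)
# and the corresponding 5 AAM model parameters per frame. Both are
# random stand-ins for real, aligned training data.
rng = np.random.default_rng(0)
audio_feats = rng.normal(size=(100, 13))
aam_params = rng.normal(size=(100, 5))

# Least-squares fit of a linear map W from audio features to parameters.
W, *_ = np.linalg.lstsq(audio_feats, aam_params, rcond=None)

# New speech features would then drive the face model frame by frame.
predicted = audio_feats @ W
print(predicted.shape)
```

A linear map is the simplest choice; any regressor from audio features to model parameters would slot into the same pipeline.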
For more information see the home page of Tue Lehn-Schiøler.