Measuring Deformation of Mesh Shapes
How different is one face from another? Meet the following
faces: "Mary" (top left), "Phil" (top right), "Jane" (lower
left), and "Paul" (lower right). (The names are not necessarily real,
to protect the innocent.)


A mesh of landmarks, or fiducial points, can
be formed on each face, e.g., by a Voronoi tessellation of the landmarks,
whose dual (the Delaunay triangulation) supplies the triangles.
Then, we gauge the deformation of each pair of corresponding triangles using
the sphericity measure, previously defined by Ansari and Delp and used
by Watunyuta. The sphericity of a pair of triangles is the ratio
of the geometric mean to the arithmetic mean of the singular values of the
matrix of the affine transformation that maps one triangle onto the other.
It equals 1 when the two triangles are similar and falls toward 0 as the
deformation grows.
In the following, we use Mary as the reference
mesh, and measure its deformation against the other three meshes. The degree
of deformation is illustrated by the size of the square on each triangle.
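As a rough illustration of the sphericity measure described above, here is a minimal NumPy sketch. It assumes each triangle is given as three 2-D vertices in matching order and that the reference triangle is not degenerate. The function names are mine and are not taken from Ansari and Delp.

```python
import numpy as np

def affine_linear_part(tri_ref, tri_def):
    """Return the 2x2 linear part of the affine map taking tri_ref to tri_def.

    Each triangle is a (3, 2) array-like of vertices in corresponding order.
    """
    tri_ref = np.asarray(tri_ref, dtype=float)
    tri_def = np.asarray(tri_def, dtype=float)
    # Edge vectors from the first vertex, stored as columns; the
    # translation part of the affine map drops out.
    edges_ref = (tri_ref[1:] - tri_ref[0]).T
    edges_def = (tri_def[1:] - tri_def[0]).T
    # A @ edges_ref = edges_def, so A = edges_def @ inv(edges_ref).
    return edges_def @ np.linalg.inv(edges_ref)

def sphericity(tri_ref, tri_def):
    """Geometric mean over arithmetic mean of the two singular values."""
    s = np.linalg.svd(affine_linear_part(tri_ref, tri_def), compute_uv=False)
    return np.sqrt(s[0] * s[1]) / (0.5 * (s[0] + s[1]))

reference = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Rotated by 90 degrees, scaled by 3, and translated: still a similar triangle.
rotated_scaled = [[2.0, 2.0], [2.0, 5.0], [-1.0, 2.0]]
# Stretched 4:1 along x only.
stretched = [[0.0, 0.0], [4.0, 0.0], [0.0, 1.0]]

similar_score = sphericity(reference, rotated_scaled)
stretched_score = sphericity(reference, stretched)
```

The similar triangle scores 1, while the 4:1 stretch has singular values 4 and 1 and scores 2 / 2.5 = 0.8.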


In each of these meshes, there are 23 nodes and 36
faces. The mesh generated using a Voronoi tessellation of the
nodes extracted from Mary is not symmetric about the vertical axis.
The right side of the mesh was changed by hand to match the left side,
resulting in a symmetric mesh.
There are two possible improvements. First, a
symmetric mesh can be made from the node locations of the average face
(see below). Second, currently there is a quadrilateral with corners
at the eyes and eyebrows along the nose. This area is partitioned
arbitrarily into two triangles by an edge from the face's right eye to the
left eyebrow. Either we have to keep this area as a quadrilateral
or we can add another node. Interestingly, the mesh generated using
the average face, which is perfectly symmetric, has the quadrilateral
area partitioned both ways (see below).
We can warp an image by using texture mapping
with a deformed mesh.

In the following, we deform Mary's image to have
Phil's mesh. The original and the deformed images are shown top and bottom,
respectively.

The mesh used in deforming an image of a face is
slightly different from the mesh used in measuring the difference of two
faces. A mesh that is used to render an image must: (a) cover
the entire image, not just the face, and (b) have the face mesh embedded.
In practice, I extended the face mesh by adding the four corners of the
image as nodes. This adds 12 more faces to the mesh.

What is an average face? I used a database of
43 faces and extracted meshes from them. They were then left-right reversed
to provide an additional 43 meshes. The 86 meshes were normalized to have
the same height and the average mesh of them is shown below. The radius
of the circle at each mesh node is equal to one standard deviation of the
node locations.
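The averaging step can be sketched as follows. This is only an illustration under several assumptions of mine: meshes are (nodes, 2) arrays of (x, y), each mesh is centered on its centroid before scaling (the text only says the heights were equalized), the hypothetical `swap` array says which node is the left-right counterpart of each node, and the per-node spread is taken as the root-mean-square distance from the mean location.

```python
import numpy as np

def normalize_height(mesh):
    """Center a mesh on its centroid and scale its vertical extent to 1."""
    mesh = np.asarray(mesh, dtype=float)
    centered = mesh - mesh.mean(axis=0)
    height = centered[:, 1].max() - centered[:, 1].min()
    return centered / height

def mirror(mesh, swap):
    """Left-right reverse a mesh: negate x, then relabel nodes.

    swap[i] is the index of the node that mirrors node i (an assumed input).
    """
    flipped = np.asarray(mesh, dtype=float) * np.array([-1.0, 1.0])
    return flipped[np.asarray(swap)]

def average_mesh(meshes, swap):
    """Return the mean node locations and a per-node spread."""
    stack = [normalize_height(m) for m in meshes]
    stack += [normalize_height(mirror(m, swap)) for m in meshes]
    stack = np.stack(stack)                      # (2 * n_faces, nodes, 2)
    mean = stack.mean(axis=0)
    # RMS distance of each node from its mean location.
    spread = np.sqrt(((stack - mean) ** 2).sum(axis=2).mean(axis=0))
    return mean, spread

# Toy mesh: left eye, right eye, chin (deliberately asymmetric).
toy = [[-1.0, 1.0], [1.2, 1.0], [0.0, -1.0]]
toy_swap = [1, 0, 2]                             # eyes swap, chin maps to itself
mean, spread = average_mesh([toy], toy_swap)
```

Because every mesh enters together with its mirror image, the resulting average is symmetric about the vertical axis, which matches the remark that the average face is perfectly symmetric.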
In the following, the average mesh is shown using the tessellation
obtained from Mary (left) and from the average mesh directly
(right).
Any face can be morphed to the average mask. In the following,
we show Mary, Phil, Jane, and Paul morphed to the average mask.
From left to right, we show the original, the displacement vectors,
and the morphed image.

A caricature is an exaggeration of the distinctive features of a face.
If we take the displacement vectors that carry a face mesh to the
average mesh and reverse them, we should get a mesh that exaggerates
the features of that face.
Shown below are the photorealistic caricatures of Mary, Phil, Jane, and Paul.
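In terms of node coordinates, the morph and the caricature differ only in the sign of the displacement. The following is a minimal sketch, not the code used for the pictures; the `gain` parameter is a hypothetical knob I added to control the amount of exaggeration, with 1.0 corresponding to simply reversing the displacement vectors.

```python
import numpy as np

def morph_and_caricature(mesh, avg_mesh, gain=1.0):
    """Return displacement vectors, the morphed mesh, and the caricature mesh.

    mesh and avg_mesh are (nodes, 2) arrays with nodes in the same order.
    gain is an assumed exaggeration factor; 1.0 reverses the displacement.
    """
    mesh = np.asarray(mesh, dtype=float)
    avg_mesh = np.asarray(avg_mesh, dtype=float)
    displacement = avg_mesh - mesh          # carries the face to the average
    morphed = mesh + displacement           # lands on the average mesh
    caricature = mesh - gain * displacement # moves away from the average
    return displacement, morphed, caricature

face = [[0.0, 0.0], [2.0, 1.0]]
average = [[0.0, 0.0], [1.0, 1.0]]
displacement, morphed, caricature = morph_and_caricature(face, average)
```

In this toy case the second node sits one unit to the right of the average, so the caricature pushes it one more unit to the right. The caricature mesh would then be used for texture mapping in the same way as any other deformed mesh.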

A useful technique for analyzing a collection of sampled data is
principal component analysis.
To calculate the principal components, we must first normalize the
data. In our case, we need to align the features geometrically
and scale the faces. Hence, each face is not only morphed
to the average mask but also placed at a standard location and given a
standard size.
The mask covers 5970 pixels; hence each of these images forms a
5970-dimensional vector by concatenating the pixels. The
average of the four vectors is shown below.

A problem with averaging pictures taken under different conditions,
such as lighting, is that the images with stronger contrast
contribute more to the average.
I therefore histogram-equalized the four images to the same palette.
Shown below are the average of the four (left), of
Mary and Jane (center), and of Paul and Phil (right).
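For readers unfamiliar with histogram equalization, here is a minimal sketch of the standard technique for 8-bit gray levels. It is an assumption on my part that this matches what "the same palette" meant for the pictures above; the function name and the `levels` parameter are mine.

```python
import numpy as np

def equalize(pixels, levels=256):
    """Histogram-equalize an array of 8-bit gray levels."""
    pixels = np.asarray(pixels, dtype=np.uint8)
    hist = np.bincount(pixels.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]               # count at the darkest level present
    span = cdf[-1] - cdf_min
    if span == 0:                           # constant image: nothing to spread
        return pixels.copy()
    # Lookup table mapping each input level to its equalized level.
    lut = np.round((cdf - cdf_min) / span * (levels - 1))
    lut = lut.clip(0, levels - 1).astype(np.uint8)
    return lut[pixels]

low_contrast = np.array([100, 101, 102, 103], dtype=np.uint8)
equalized = equalize(low_contrast)
```

The four nearly identical gray levels are spread over the full range (0, 85, 170, 255), so a low-contrast picture ends up with the same dynamic range as a high-contrast one before averaging.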
The average of the database of 86 images is shown below.

We can normalize each image in the database by subtracting the
average image from it.
The difference image of course has positive and negative
values. For illustration purposes, we show the difference image biased
by a constant mid-tone gray.

I subtracted the average image from the 86 face images in our database.
These then form a set of points centered around the origin
in the 5970-dimensional space. The images are organized as a
5970 by 86 matrix, denoted as L, with one image per column. I then computed the
singular value decomposition of L; i.e., found matrices U and V with
orthonormal columns and a diagonal matrix D so that L = U D V'.
The 5970 by 86 matrix U then holds the 86 eigenvectors (of L L') as its
columns; the 86 entries along the diagonal of D are the singular values.
The first 24 eigenvectors, rendered as images called eigenfaces, are shown below.
These eigenfaces span the subspace of the 86 images in our database;
i.e., any of the faces can be reconstructed (up to some residual errors)
by a linear combination of these images and the average face.
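The decomposition can be sketched with NumPy as follows. Random numbers stand in for the face database, which is not reproduced here; only the 5970 by 86 shape comes from the text, and the function name is mine.

```python
import numpy as np

def eigenfaces(images):
    """Decompose mean-subtracted face vectors.

    images is an (n_pixels, n_faces) array with one face per column.
    Returns the mean face, U, the singular values, and V' (as Vt).
    """
    images = np.asarray(images, dtype=float)
    mean_face = images.mean(axis=1, keepdims=True)
    L = images - mean_face
    # full_matrices=False gives the thin SVD: U is n_pixels by n_faces.
    U, d, Vt = np.linalg.svd(L, full_matrices=False)
    return mean_face, U, d, Vt

rng = np.random.default_rng(0)
database = rng.random((5970, 86))        # stand-in data, not real faces
mean_face, U, d, Vt = eigenfaces(database)
first_24 = U[:, :24]                     # columns to display as eigenfaces
```

The thin SVD is used because only 86 columns of U can carry nonzero weight. NumPy returns the singular values in `d` sorted from largest to smallest, so the leading columns of U are the most significant eigenfaces.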
How many eigenfaces should we use in a reconstruction?
Take a look at the singular values.
Consider the matrix equation
L = U D V'.
The matrix equation says that each column of L
is a linear combination of the columns of U.
The weights are obtained from D V'.
If a term in D is particularly large, the weight for the
corresponding column in U is obviously larger. Looking
at the singular values shown above, we see that the first 8 or 9
weights are much larger than the rest. In fact, after the
first 21 or so, the weights slowly taper off to about 10% of
the largest weight.
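One simple way to turn this observation into a number is to count the singular values that exceed some fraction of the largest one. The sketch below does that; the 10% default merely echoes the figure quoted above and is not a rule used in this work, and the function name is mine.

```python
import numpy as np

def count_significant(singular_values, fraction=0.10):
    """Count singular values strictly larger than fraction * largest."""
    d = np.asarray(singular_values, dtype=float)
    return int(np.sum(d > fraction * d.max()))

# Made-up singular values for illustration only.
example = [10.0, 5.0, 2.0, 1.0, 0.5]
kept = count_significant(example)
relative = np.asarray(example) / max(example)
```

Here three values exceed 10% of the largest. Plotting `relative` for the real singular values gives the same picture as the plot referred to above, with every weight expressed as a fraction of the largest.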
How well does it work?
Meet Adam (below, left) and Jill (below, right), whose
pictures were in the database.

I projected both of these onto the subspace spanned by the first
24 eigenfaces.
Their coefficients are as below.
Since Adam and Jill do not in the least resemble each other, not
surprisingly, their coefficients are quite different.
I formed linear combinations of the first 12 and of the first 24 eigenfaces.
In the following, I show the original (left), reconstruction
using 12 eigenfaces (middle), and reconstruction using
24 eigenfaces (right).
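Projection and reconstruction amount to two matrix products. The sketch below uses a small random stand-in for the database, so the numbers mean nothing as faces; the function names and the toy sizes are mine.

```python
import numpy as np

def project(face, mean_face, U, k):
    """Coefficients of a face on the first k eigenfaces (columns of U)."""
    return U[:, :k].T @ (face - mean_face)

def reconstruct(coeffs, mean_face, U):
    """Average face plus a linear combination of the leading eigenfaces."""
    k = coeffs.shape[0]
    return mean_face + U[:, :k] @ coeffs

rng = np.random.default_rng(0)
faces = rng.random((200, 30))            # stand-in: 200 pixels, 30 faces
mean_face = faces.mean(axis=1)
U, d, Vt = np.linalg.svd(faces - mean_face[:, None], full_matrices=False)

face = faces[:, 0]
errors = {}
for k in (12, 24):
    approx = reconstruct(project(face, mean_face, U, k), mean_face, U)
    errors[k] = np.linalg.norm(face - approx)
```

Because the columns of U are orthonormal, the coefficients are plain dot products, and adding eigenfaces can never increase the residual error. A face that belongs to the database is recovered exactly once all of the eigenfaces are used.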

Independent component analysis is a new method for decomposing a
collection of sample vectors. Unlike the eigenvectors, which
are pairwise orthogonal, the independent components are statistically
independent. Somying Thainimit applied independent component
analysis to the collection of 5970-dimensional
vectors. The first 24 independent components are shown below.
What else needs to be done? An incomplete list:
- We are more accustomed to non-photorealistic caricatures. Also, we may
  want to limit the number of exaggerated features; e.g., emphasize
  Mary's smile but leave her face shape alone.
- Suppose we have a canonical mesh and a list of fiducial
  points (such as those extracted automatically by computer vision algorithms).
  How do we match the fiducial points to the mesh nodes, thereby creating
  a mesh for the extracted landmarks?
- How do we integrate the distributed sphericities into
  a single measure for the whole face?
- How do we use localized sphericities to measure local
  shape differences (e.g., eye, nose, left cheek, mouth, chin)?
- Can we generate a "local expert" from a gallery
  of meshes and then use it to guide searches for fiducial points in an
  image, similarly to the Elastic Bunch Graph matching method?
All digital images, except the first four of
this page, are copyright © 2001
by Henry Chu. All rights reserved.
Contact Henry Chu at cice@cacs.louisiana.edu
for more information.