The Neural Computation Lab

Projects:

  • Automatic face recognition
  • Face Recognition from Blurred Images
  • Seismic Data Classification
  • Preprocessing for Cursive Word Recognition
  • DARPA/ONR Project (Mine Detection)

Face recognition   (With Ariel Tankus, Hezy Yeshurun, Danny Reisfeld)

This work combines techniques motivated by biological vision with state-of-the-art artificial neural network recognition schemes to perform on-line face recognition on large face datasets.
The first step involves image preprocessing based on ideas of Hezy Yeshurun and colleagues:

  • Face detection and normalization demo  (Ariel Tankus)
  • TAU 370-Faces Data-set   (Ariel Tankus)

The next step involves classification with a hybrid of unsupervised and supervised networks, as shown below. This architecture, described in a 1993 paper, combines an unsupervised goal of extracting "important" features from objects with the supervised goal of extracting features that are useful for the recognition task. The result is a more robust network that gives superior results when the amount of training data is small. A more recent paper adds automatic rescaling of faces based on the normalization discussed above.

    Hybrid Unsupervised/Supervised Architecture
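The hybrid idea can be sketched as a single hidden layer that receives back-propagated gradients from both a supervised classification head and an unsupervised criterion. Below is a minimal NumPy sketch; the toy Gaussian-cluster data, the sizes, the constants, and the use of simple variance maximization as a stand-in for the BCM-style unsupervised rule are all illustrative assumptions, not the published settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for face data: 3 classes with well-separated Gaussian centers.
centers = rng.normal(size=(3, 16))
y = rng.integers(0, 3, size=30)
X = centers[y] + 0.3 * rng.normal(size=(30, 16))
Y = np.eye(3)[y]                            # one-hot targets

W1 = rng.normal(scale=0.1, size=(16, 8))    # shared hidden layer
W2 = rng.normal(scale=0.1, size=(8, 3))     # supervised (classification) head
mu, lr = 0.5, 0.3                           # weight of the unsupervised term (assumed)

for _ in range(400):
    H = np.tanh(X @ W1)
    Z = np.exp(H @ W2)
    P = Z / Z.sum(axis=1, keepdims=True)    # softmax class probabilities
    dZ = (P - Y) / len(X)                   # supervised cross-entropy gradient
    # Unsupervised term: gradient *ascent* on hidden-unit variance, a hedged
    # stand-in for the BCM-style "important feature" extraction rule.
    dH = dZ @ W2.T - mu * (H - H.mean(axis=0)) / len(X)
    W2 -= lr * (H.T @ dZ)
    W1 -= lr * (X.T @ (dH * (1 - H**2)))    # backprop through tanh

pred = (np.tanh(X @ W1) @ W2).argmax(axis=1)
acc = (pred == y).mean()
```

The single constant `mu` plays the role of the tradeoff between the unsupervised and supervised goals; because the hidden layer is shared, features useful for both objectives survive training.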


    Face Recognition from Blurred Images   (With Inna Stainvas, Amiram Moshaiov)

    This project involves the study of different learning goals for artificial neural networks and their effects on recognition performance for low-quality face images: blurred, partially occluded, and lossily compressed.

    These learning goals may be expressed as different information-theoretic constraints, such as BCM, ICA, EM, etc. We show, however, that a mathematically simpler reconstruction constraint achieves improved performance on both original and corrupted inputs.

    Combined classification/reconstruction network architecture

    This architecture attempts to reconstruct the images and to classify them from the same low-dimensional (hidden) representation. Like the hybrid architecture presented above, it attempts to extract more features than are strictly needed to classify a small set of training patterns, and is thus more robust to image degradation. This specific architecture turns out to be superior to classical feed-forward architectures, as well as to hybrid architectures with various information-theoretically motivated unsupervised feature-extraction schemes.
    The combined learning rule for the hidden-layer units is a composition of the errors back-propagated from the reconstruction layer and from the recognition layer. The relative influence of each output layer is determined by a constant $\lambda$ that represents the tradeoff between reconstruction and classification ability.
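    A minimal NumPy sketch of this combined rule follows. The toy Gaussian data standing in for face images, the layer sizes, the learning rate, and the choice to apply $\lambda$ only to the hidden-layer gradient are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "face images": two classes around Gaussian centers in 12 dimensions.
centers = rng.normal(size=(2, 12))
y = rng.integers(0, 2, size=40)
X = centers[y] + 0.3 * rng.normal(size=(40, 12))
Y = np.eye(2)[y]

W_h = rng.normal(scale=0.1, size=(12, 5))   # encoder to the low-dim hidden layer
W_r = rng.normal(scale=0.1, size=(5, 12))   # reconstruction output layer
W_c = rng.normal(scale=0.1, size=(5, 2))    # recognition (classification) output layer
lam, lr = 0.25, 0.2                         # lambda: classification/reconstruction tradeoff

for _ in range(800):
    H = np.tanh(X @ W_h)
    R = H @ W_r                             # linear reconstruction of the input
    Z = np.exp(H @ W_c)
    P = Z / Z.sum(axis=1, keepdims=True)    # softmax class probabilities
    dR = (R - X) / len(X)                   # squared-error gradient (reconstruction)
    dC = (P - Y) / len(X)                   # cross-entropy gradient (classification)
    # Hidden-layer gradient: composition of both back-propagated errors,
    # weighted by lambda as described in the text.
    dH = lam * (dC @ W_c.T) + (1 - lam) * (dR @ W_r.T)
    W_r -= lr * (H.T @ dR)
    W_c -= lr * (H.T @ dC)
    W_h -= lr * (X.T @ (dH * (1 - H**2)))

mse = np.mean((np.tanh(X @ W_h) @ W_r - X) ** 2)
acc = ((np.tanh(X @ W_h) @ W_c).argmax(axis=1) == y).mean()
```

    With $\lambda=0$ the hidden layer is shaped purely by reconstruction (the classifier head still trains on top of it); with $\lambda=1$ the sketch reduces to a standard feed-forward classifier.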

    Some examples of the normalized "clean" TAU data base

    Training is done on either clean or corrupted data, while the network is constrained to reconstruct the clean data (so as to generate features that are insensitive to blur and other image degradations).

    Images after Gaussian Blur and mean intensity removal

    Due to the compression through the small hidden layer, reconstructed images are not exact copies of the input. Reconstructed faces are robust to different types of occlusion. Below we show reconstructed faces for Difference-of-Gaussians (DoG) blurred images. The DoG filter is a band-pass filter intended to enhance edges.

    Reconstruction by the hybrid NN

    The first image is a clean "caricature" face (mean removed). The next is the DoG-filtered image; the next two faces are reconstructions by networks with $\lambda=0$ and $\lambda=0.25$; the last face is the reconstruction of an ensemble of NNs, i.e., the average of the reconstructed images of all NNs with different $\lambda$ parameters. ($\lambda$ is the constant that determines the tradeoff between the reconstruction task and the classification task.)

    Types of Image Degradation Used


    From left to right: Difference-of-Gaussians (DoG) blur (band-pass filter); Gaussian blur (low-pass filter); motion blur in the diagonal direction (low-pass filter); high-pass filtered image; salt-and-pepper noise; Gaussian noise.
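    These degradations are all easy to generate. A small NumPy sketch of several of them follows; kernel sizes, noise levels, and the random "face" are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(sigma):
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur (low-pass)."""
    k = gaussian_kernel(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)

def dog(img, s1=1.0, s2=2.0):
    """Difference-of-Gaussians (band-pass): difference of two low-pass blurs."""
    return gaussian_blur(img, s1) - gaussian_blur(img, s2)

def motion_blur(img, length=5):
    """Motion blur in the diagonal direction (circular shifts, for simplicity)."""
    out = np.zeros_like(img)
    for t in range(length):
        out += np.roll(np.roll(img, t, axis=0), t, axis=1)
    return out / length

def salt_and_pepper(img, p=0.05):
    """Flip a fraction p of pixels to the extreme intensities."""
    noisy = img.copy()
    u = rng.random(img.shape)
    noisy[u < p / 2] = img.min()
    noisy[u > 1 - p / 2] = img.max()
    return noisy

face = rng.random((32, 32))            # stand-in for a normalized face image
degraded = {
    "gaussian_blur": gaussian_blur(face, 2.0),
    "dog": dog(face),
    "motion_blur": motion_blur(face),
    "salt_pepper": salt_and_pepper(face),
    "gaussian_noise": face + rng.normal(scale=0.1, size=face.shape),
}
```

    The high-pass filtered image from the figure can be obtained as the input minus its Gaussian blur, the complement of the low-pass case.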

    Image restoration

    Denoising and deblurring may be done before classification of the corrupted images. However, it is well known that image restoration is an ill-posed problem and may be unstable, i.e., the solution may be very sensitive to small perturbations.
    We have been testing the following image restoration methods:
  • Frequency-domain methods: Wiener filter, pseudo-inverse filter, and inverse filter (in the absence of noise)
  • Estimation of blurring and noise parameters via iterative methods

    From left to right: clean image; Gaussian-blurred image (std=2) with additive Gaussian noise; blind deconvolution; deconvolution with a known blurring filter.
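    As a concrete instance of the frequency-domain family, here is a minimal Wiener-filter deconvolution in NumPy. The box PSF, the toy image, and the noise-to-signal ratio are assumptions for illustration; setting that ratio to zero recovers the inverse filter, whose instability is exactly the ill-posedness noted above:

```python
import numpy as np

def wiener_deconvolve(blurred, psf, nsr=1e-3):
    """Frequency-domain Wiener filter: W = H* / (|H|^2 + NSR).

    nsr is the (assumed) noise-to-signal power ratio. With nsr = 0 this
    reduces to the inverse filter 1/H, which divides by near-zero
    frequency responses and amplifies noise without bound.
    """
    H = np.fft.fft2(psf, s=blurred.shape)   # zero-padded PSF => circular model
    G = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(W * G))

# Demo: blur a toy image with a 3x3 box PSF (circular convolution), then restore.
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
psf = np.ones((3, 3)) / 9.0
blurred = np.real(np.fft.ifft2(np.fft.fft2(psf, s=img.shape) * np.fft.fft2(img)))
restored = wiener_deconvolve(blurred, psf)
```

    In practice the blurring filter and the noise-to-signal ratio are unknown and must themselves be estimated, which is where the iterative parameter-estimation methods listed above come in.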


    Cursive Word Recognition   (With Tal Steinhertz, Ehud Rivlin)

    We are currently developing a set of preprocessing algorithms that can be used with any off-line handwritten word recognition system. Each algorithm can be used separately or as part of a complete preprocessing system.
    Following is a list of the algorithms developed:

  • Skew finding and correction; can be applied to a stand-alone word, a single line, or a full text page.
  • Stroke width estimation.
  • Locating and fixing strokes left discontinuous by scanning problems.
  • Skeletonization, including labeling of regular strokes (axis strokes) and singular strokes.
  • Pre-segmentation based on the skeleton obtained.
  • Slant angle finding and correction.
  • Currently under development: recovery of loops that are incomplete or lost due to blotting.

    Following is a short demonstration of some of the preprocessing steps. The first image shows a scanned cursive word image ("beautiful") after binarization.

    Original image (left)   and Correcting for lost loops (right)

    As a first stage one should look for lost loops. Indeed, two lost loops were found, as can be seen in the next image: a hidden loop that belongs to the 'a' character and a smaller one in the middle of the 'f'. As a by-product of this process we also obtain a correct stroke-width estimate that can be used for further processing.
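    A stroke-width estimate of this kind can be computed very simply. One common heuristic, sketched below as an illustration (not necessarily the method used in this work), takes the most frequent black-run length over the rows and columns of the binarized image:

```python
import numpy as np

def stroke_width(binary):
    """Estimate stroke width of a binarized word image (1 = ink) as the
    most common horizontal/vertical black-run length."""
    runs = []
    for img in (binary, binary.T):          # rows, then columns
        for line in img:
            padded = np.concatenate(([0], line, [0]))
            d = np.diff(padded)
            starts = np.flatnonzero(d == 1)   # 0 -> 1 transitions
            ends = np.flatnonzero(d == -1)    # 1 -> 0 transitions
            runs.extend(ends - starts)        # run lengths
    return int(np.bincount(np.array(runs)).argmax())

# Toy check: a vertical bar 3 pixels wide.
img = np.zeros((20, 20), dtype=int)
img[:, 8:11] = 1
width = stroke_width(img)                   # → 3
```

    Taking the mode over both directions makes the estimate robust to stroke orientation: runs along a stroke are long and rare, while runs across it cluster at the true width.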

    The next left image presents the pseudo-skeleton of the word image. We use the term pseudo because we do not produce a skeleton that satisfies the mathematical definition of an object skeleton. However, it fulfills our goals of preserving all meaningful strokes with their original properties, such as direction, curvature (where present), and length. Unfortunately some noise and artifacts remain, and we are currently working on reducing them and on smoothing the edges of the resulting skeleton.

    The respective skeleton of the word image (left)   and Slant correction (right)

    The right image shows the previously extracted skeleton after the important preprocessing step of slant correction. Note that the ascenders and descenders that were slanted in all previous images are now upright. The slant angle is found from the skeleton, but it can be used to correct the original image as well, so slant correction is also powerful as a stand-alone algorithm.
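    A common way to implement such slant correction (a sketch under assumptions; the angle range, step, and variance criterion are illustrative choices, not necessarily those used here) searches for the horizontal shear that makes the vertical projection profile most peaky, i.e., the strokes most upright:

```python
import numpy as np

def deslant(img, angles=np.linspace(-45, 45, 91)):
    """Estimate the slant angle of a binarized word image (1 = ink) and
    shear-correct it. The best angle is the one whose horizontal shear
    yields the peakiest (highest-variance) vertical projection profile."""
    h, w = img.shape
    ys, xs = np.nonzero(img)
    best_angle, best_score = 0.0, -1.0
    for a in angles:
        shear = np.tan(np.deg2rad(a))
        # Shift each ink pixel horizontally in proportion to its height
        # above the bottom row.
        new_x = np.round(xs - shear * (h - 1 - ys)).astype(int)
        profile = np.bincount(new_x - new_x.min(), minlength=w)
        score = profile.var()            # upright strokes => narrow, tall peaks
        if score > best_score:
            best_angle, best_score = a, score
    # Apply the winning shear to produce the deslanted image.
    shear = np.tan(np.deg2rad(best_angle))
    new_x = np.round(xs - shear * (h - 1 - ys)).astype(int)
    new_x -= new_x.min()
    out = np.zeros((h, new_x.max() + 1), dtype=img.dtype)
    out[ys, new_x] = 1
    return best_angle, out

# Toy check: a single stroke slanted at 45 degrees should come back upright.
img = np.zeros((20, 40), dtype=int)
for y in range(20):
    img[y, 10 + (19 - y)] = 1
angle, upright = deslant(img)
```

    Because only the shear angle is needed, the same estimate can be applied to the original gray-level image, matching the stand-alone use described above.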