Thursday, April 15, 2010

Computer Vision Technologies




Introduction

Face recognition is a branch of computer vision under active research and development, and it is usually classed as a biometric technology. The field has progressed rapidly in recent years. Government applications are numerous and in demand, and many businesses in the private sector could also benefit directly from commercializing these technologies.

Some examples:
  • Duplicate identity registration detection
  • Surveillance 
  • Robotics
  • Next generation UI

The possibilities go well beyond these examples.

Available Solutions

Although face recognition is still a research domain, many commercial products and some open source ones have been developed. Some of them are:

http://www.neurotechnology.com - Commercial SDK. Costs ~$1000 for the full version
http://opencv.willowgarage.com/wiki/ - Open Source. 40k active users
http://www.sensiblevision.com/ - Commercial SDK. Enterprise solutions
http://www.pittpatt.com - Commercial SDK with web demos. Comprehensive capabilities

Google Picasa now also offers face recognition for tagging people in images.

A Custom Implementation

As stated, this is a research domain, and many approaches have been explored to perfect the technology. Algorithms for face recognition range from vector mathematics to machine learning to signal processing. Every method has its own advantages, and the best results are achieved by combining multiple methods.

Some known approaches are:

Eigenface - Mostly seen in college projects
Hidden Markov Model
Dynamic Link Matching - Neural network based

Depending on which approach is used, the following data structures can be identified.

For vector based approach

Various points are identified on a face, and their positions and the distances between them are recorded as vectors. In this way a face is converted into a mathematical model. Because the data is held as vectors, various transformations can be applied to it to reach an acceptable degree of recognition.

The following data structure can be used to hold such data:


{Float x, y, z} Point;
{Point start, end; int direction} Vector;
{List<Vector> vectors} Connect;
{List<Connect> connections; int number} Face;

So a face is represented by a list of connections. Each connection is a list of vectors, and each vector is a pair of connected points with a direction.
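
A minimal Python sketch of this structure might look like the following. The field names (start, end, vectors, connections) and the signature() helper are my own assumptions for illustration; they are not taken from any particular library. The helper only shows one way the stored vectors could be turned into numbers that do not change when the face is scaled.

```python
from dataclasses import dataclass, field
from math import dist
from typing import List


@dataclass(frozen=True)
class Point:
    x: float
    y: float
    z: float = 0.0


@dataclass(frozen=True)
class Vector:
    start: Point        # hypothetical name for the first point
    end: Point          # hypothetical name for the second point
    direction: int = 1  # assumed convention: 1 = start to end, -1 = reversed

    def length(self) -> float:
        # Euclidean distance between the two connected points.
        return dist((self.start.x, self.start.y, self.start.z),
                    (self.end.x, self.end.y, self.end.z))


@dataclass
class Connect:
    vectors: List[Vector] = field(default_factory=list)


@dataclass
class Face:
    connections: List[Connect] = field(default_factory=list)
    number: int = 0

    def signature(self) -> List[float]:
        # Vector lengths divided by the longest one, so that the same
        # face photographed at a different size gives similar values.
        lengths = [v.length() for c in self.connections for v in c.vectors]
        longest = max(lengths, default=0.0)
        return [l / longest for l in lengths] if longest else lengths


left_eye, right_eye, nose = Point(30, 40), Point(70, 40), Point(50, 60)
face = Face([Connect([Vector(left_eye, right_eye), Vector(left_eye, nose)])], number=1)
print(face.signature())
```

Normalizing by the longest vector is just one simple transformation; rotation and pose would need more work than this sketch attempts.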

For neural network approach

A neural network approach is closer to human vision. It is based on machine learning and may be well suited to targeted applications. For example, a neural network that identifies goats may rely on different patterns than one that recognizes elephants. As a result, a neural network approach can produce expert software for a specific set of subjects.

For the purposes of this design, a neural network implementation can be treated as a rule-based system that works over existing data with a certain degree of acceptability.

A function evaluates the degree of confidence for new data based on previous data. The data structure representing such a system can be:


{Face, Tag, Confidence} Rule;
{List Rule} Network;


A function will then use the network to classify a Face.
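
As a rough illustration, here is a Python sketch of the Rule and Network structures with such a function. To be clear, this is a plain nearest-match lookup and not a trained neural network. The face is stood in for by a tuple of numbers, and the scoring formula, the classify name and the threshold parameter are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    face: Tuple[float, ...]  # stand-in for Face: a fixed-length feature tuple
    tag: str                 # the name attached to this face
    confidence: float        # how much this rule is trusted, from 0 to 1


Network = List[Rule]


def classify(network: Network, face: Sequence[float],
             threshold: float = 0.5) -> Optional[Tuple[str, float]]:
    """Return (tag, score) of the best matching rule, or None if too weak."""
    best: Optional[Tuple[str, float]] = None
    for rule in network:
        # Assumed scoring: similarity falls from 1 towards 0 as distance grows,
        # and is weighted by the confidence stored with the rule.
        similarity = 1.0 / (1.0 + dist(rule.face, face))
        score = similarity * rule.confidence
        if best is None or score > best[1]:
            best = (rule.tag, score)
    if best is None or best[1] < threshold:
        return None
    return best


network = [Rule((1.0, 0.7), "alice", 0.9), Rule((1.0, 0.4), "bob", 0.9)]
print(classify(network, (1.0, 0.7)))
```

A real system would learn its weights from the training data instead of comparing against stored examples one by one.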

OpenCV has over 500 algorithms implemented. Rather than developing a custom solution from scratch, I would suggest using OpenCV as a base for further development, or using one of the existing SDKs to implement face recognition.


System Components

  • Training module on customer computer: This module will allow a user to train the Face Recognition System (FRS from here on) by providing a list of images and their name tags.
  • Face Recognition System: Based on the training provided, identifies faces later in captured video and images
  • Notification Triggers: Actions to take when a new face is identified
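
The three components above could be wired together roughly as in the Python sketch below. All class and method names here are hypothetical, faces are assumed to arrive already converted to feature tuples, and I am assuming that a "new face" means one that does not match anything in the training data.

```python
from math import dist
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Features = Tuple[float, ...]  # assumed: a face already reduced to numbers


class FaceRecognitionSystem:
    def __init__(self, max_distance: float = 0.25) -> None:
        self.known: Dict[str, List[Features]] = {}
        self.max_distance = max_distance  # assumed cut-off for a match
        self.triggers: List[Callable[[Features], None]] = []

    # Training module: store (features, name tag) pairs from the user.
    def train(self, samples: Sequence[Tuple[Features, str]]) -> None:
        for features, tag in samples:
            self.known.setdefault(tag, []).append(tuple(features))

    # Notification triggers: register an action for unmatched faces.
    def on_new_face(self, callback: Callable[[Features], None]) -> None:
        self.triggers.append(callback)

    # Recognition: return the tag of the closest trained face, if any.
    def recognize(self, features: Features) -> Optional[str]:
        best_tag, best_distance = None, self.max_distance
        for tag, examples in self.known.items():
            for example in examples:
                d = dist(example, features)
                if d <= best_distance:
                    best_tag, best_distance = tag, d
        if best_tag is None:
            for callback in self.triggers:
                callback(features)
        return best_tag


frs = FaceRecognitionSystem()
frs.train([((1.0, 0.7), "alice")])
frs.on_new_face(lambda f: print("unknown face:", f))
print(frs.recognize((1.0, 0.72)))
frs.recognize((0.2, 0.1))
```

Keeping the triggers as plain callbacks means the actions (email, log entry, alarm) can be decided by the customer without touching the recognition code.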

Algorithm

Broadly:
  1. Get the training data
  2. Store it in a matrix, which serves as the mathematical model
  3. Use a machine learning approach to identify patterns later

OpenCV has a highgui package which handles video and image capture and loads the result into a Mat structure, a matrix implementation with almost all matrix functions available. The ml package has machine learning algorithms implemented. These ML algorithms can read this data and make decisions. Decision trees and neural networks are the well-known approaches. To get better accuracy, multiple approaches should be used.
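
One simple way to use multiple approaches together is a majority vote. The Python sketch below assumes each approach is wrapped as a function that takes the face features and returns a name tag or None. The combine function and its min_votes parameter are made up for this example and are not part of OpenCV.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Assumed interface: every approach maps features to a tag, or None if unsure.
Classifier = Callable[[Sequence[float]], Optional[str]]


def combine(classifiers: Sequence[Classifier], features: Sequence[float],
            min_votes: int = 2) -> Optional[str]:
    """Return the tag most classifiers agree on, or None without enough votes."""
    votes = Counter()
    for classifier in classifiers:
        tag = classifier(features)
        if tag is not None:  # abstentions are not counted
            votes[tag] += 1
    if not votes:
        return None
    tag, count = votes.most_common(1)[0]
    return tag if count >= min_votes else None


# Stand-ins for, say, a decision tree, a neural network and an eigenface matcher.
def tree(features): return "alice"
def neural(features): return "alice"
def eigen(features): return "bob"


print(combine([tree, neural, eigen], (1.0, 0.7)))
```

Requiring at least two votes trades some recognitions for fewer false matches; the right balance depends on the application.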
