Spontaneous Facial Expression Recognition Based on Histogram of Oriented Gradients Descriptor

Automatically detecting facial expressions has become an important research area. It plays a significant role in security, human-computer interaction and health care. Yet earlier work focuses on posed facial expressions. In this paper, we propose a spontaneous facial expression recognition method based on effective feature extraction and expression recognition for micro-expression analysis. For feature extraction we use the histogram of oriented gradients (HOG) descriptor to extract facial expression features. Expression recognition is performed by a support vector machine (SVM) classifier that recognizes six emotions (happiness, anger, disgust, fear, sadness and surprise). Experiments show promising results, with recognition accuracy of 95% on static images and 80% on videos.


Introduction
Decision making is part of our day-to-day life. Facial expressions help humans perceive useful information, make decisions and give instant responses during social communication. However, ordinary facial expressions are easy to fake, and this can lead to incorrect decisions. Unlike ordinary facial expressions, spontaneous micro-expressions are difficult to hide or fake because they are involuntary expressions shown on human faces according to the emotions experienced. They are very brief, lasting only 1/25 to 1/5 of a second. Currently only highly trained individuals are able to distinguish them, and even with proper training the recognition accuracy is only 47%. Combining computer vision and behavioral sciences has large potential for developing a technology that helps in this respect. The major challenges are: the short duration of the expressions requires a camera with a high frame rate, and the detection of slight changes in the facial skin requires effective feature extraction, representation and facial expression recognition algorithms.
Facial expression analysis goes well back into the nineteenth century, when Darwin published his work on the expression of emotions in man and animals. He claimed that we cannot understand human emotional expression without understanding emotional expression in animals. In 1971 Ekman and Friesen postulated six primary emotions, each with a distinctive content and a unique facial expression. Based on published work (Porter & Brinke, 2008; Ekman, 2009), micro-expressions are the most important behavioral source for lie detection, and they are used for danger demeanor detection as well.
In this paper we propose a spontaneous facial expression analysis in which the face is divided into specific regions, a histogram of oriented gradients is calculated for each region to extract a feature vector, and a support vector machine is applied for classification. The system achieves very promising results that compare favorably with human accuracy and other detection methods because we rely on effective feature extraction (HOG), which captures edge and gradient structure that is very characteristic of local shape, and does so in a local representation with an easily controllable degree of invariance to local geometric and photometric transformations.

Related Work
Research on spontaneous micro-expressions has progressed in two major directions: psychology and computer vision.
The work done by the psychologist Dr. Paul Ekman and his colleagues has become a milestone in the study of facial expression detection. After the discovery of the six universal emotions and their micro-expressions, Ekman and Friesen (1978) developed a tool for measuring facial movements, the Facial Action Coding System (FACS), the most widely used method for recognizing facial expressions. FACS encodes the contraction of each facial muscle (alone as well as combined with other muscles) in terms of action units; each facial expression is expressed as a combination of action units. Essa and Pentland (1995) developed two approaches (muscle and motion energy) for representing facial motion. The first is the peak muscle actuation template: it measures muscle actuations for many people making different expressions, finds that each expression defines a typical pattern of muscle actuation, and uses the result as a template for recognition. The second is the motion energy template, which generates a pattern of motion energy associated with each facial expression, representing how much movement there is at each place on the face. The overall success rate of detection was 78%. Shreve and Godavarthy (2011) proposed a method for the automatic spotting of facial expressions in long videos comprising macro- and micro-expressions using spatio-temporal strain, where motion and strain are computed separately. They compute motion using robust optical flow, but a reliable solution requires the assumptions that the intensity of a point on a moving object remains constant across a pair of frames and that pixels in a small image window move with a similar velocity; for the strain computation they used the finite difference method. They achieved accuracies of 85% for macro-expression and 74% for micro-expression detection.

Proposed Algorithm
We propose a method to recognize spontaneous micro-expressions. The primary input is a video sequence of the subject; we expect only one face at a time.

Face Detection and Tracking
Face detection comes with its own challenges, such as camera quality, illumination, facial hair, pose and rotation of the face. We detect and track the face as shown in Figure 2 using Continuously Adaptive Mean Shift (CAMshift), which is based on the Mean Shift algorithm and is one of the promising methods for face tracking.
The CAMshift algorithm consists of the following steps:
1) Select an initial location of the search window.
2) Create the color histogram within the search window to represent the face under test.
3) Calculate the color probability distribution of the region centered at the Mean Shift search window.
4) Perform the Mean Shift algorithm on the area until suitable convergence. Compute and store the zeroth moment and centroid coordinates.
5) For the next frame, center the search window at the mean location found in Step 4 and set the window size to a function of the zeroth moment. Go to Step 3.
The Mean Shift algorithm calculates the area and the center of mass under the window using the following equations.

Zeroth moment:

$$M_{00} = \sum_x \sum_y I(x, y) \quad (1)$$

First order moments:

$$M_{10} = \sum_x \sum_y x \, I(x, y), \qquad M_{01} = \sum_x \sum_y y \, I(x, y) \quad (2)$$

The center of mass (x_c, y_c) is then calculated as:

$$x_c = \frac{M_{10}}{M_{00}}, \qquad y_c = \frac{M_{01}}{M_{00}} \quad (3)$$

where I(x, y) is the pixel probability value at position (x, y) in the search window. The HOG descriptor is then computed over the tracked face region in three stages: gradient computation, orientation binning and block normalization.
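The moment computations above can be sketched as follows. This is a minimal illustration, assuming the back-projected face probability image is given as a 2-D list of weights; the function name and window parameters are ours, not part of the original implementation.

```python
# Sketch of the CAMshift centroid update (Eqs. 1-3), assuming the face
# probability image `prob` is a 2-D list of back-projection weights.
def window_centroid(prob, x0, y0, w, h):
    """Zeroth/first moments and center of mass inside a search window."""
    m00 = m10 = m01 = 0.0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            p = prob[y][x]
            m00 += p          # zeroth moment: total mass, Eq. (1)
            m10 += x * p      # first moment in x, Eq. (2)
            m01 += y * p      # first moment in y, Eq. (2)
    if m00 == 0:
        return None           # empty window: no face-colored pixels
    return m10 / m00, m01 / m00   # (x_c, y_c), Eq. (3)
```

In the full tracker this update would be iterated until the centroid shift falls below a threshold (Mean Shift convergence), after which the window size is rescaled as a function of the zeroth moment.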

Gradient Computation
HOG is a feature descriptor based on gradient orientation (edge direction), so we apply a Sobel filter, which computes the gradient magnitude and orientation, to detect edges in the facial region image as shown in Figure 4(b).
The gradient magnitude and orientation of each pixel are calculated using Equations (5) and (6):

$$m(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2} \quad (5)$$

$$\theta(x, y) = \arctan\frac{G_y(x, y)}{G_x(x, y)} \quad (6)$$

where G_x and G_y are the horizontal and vertical Sobel responses. The gradient orientations within each cell are accumulated into a histogram of orientation bins, and the descriptor of each block is normalized. The normalization is done by Equation (7):

$$v \leftarrow \frac{v}{\sqrt{\|v\|_2^2 + \epsilon^2}} \quad (7)$$

where v is the un-normalized descriptor vector and ϵ a small normalization constant to avoid division by zero.
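The gradient computation, orientation binning and normalization steps can be sketched as below. This is a simplified illustration, assuming a grayscale image stored as a 2-D list; the central-difference gradients stand in for the Sobel filter, and the cell size (8) and bin count (9) are common HOG defaults, not values stated in the paper.

```python
import math

def cell_histogram(img, x0, y0, cell=8, bins=9):
    """Orientation histogram of one HOG cell (Eqs. 5-6 plus binning)."""
    hist = [0.0] * bins
    for y in range(y0, y0 + cell):
        for x in range(x0, x0 + cell):
            gx = img[y][x + 1] - img[y][x - 1]      # horizontal gradient
            gy = img[y + 1][x] - img[y - 1][x]      # vertical gradient
            mag = math.hypot(gx, gy)                # magnitude, Eq. (5)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned angle, Eq. (6)
            hist[int(ang / (180.0 / bins)) % bins] += mag   # orientation binning
    return hist

def l2_normalize(v, eps=1e-5):
    """Block normalization, Eq. (7)."""
    norm = math.sqrt(sum(c * c for c in v) + eps * eps)
    return [c / norm for c in v]
```

In the full descriptor, cell histograms from each overlapping block (Figure 5) are concatenated and normalized block by block before being joined into the final feature vector.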
Figure 5. Overlapped blocks with HOG visualization

Facial Expression Classification
The classification is based on a supervised learning technique. We used a set of training images of different expressions. After applying our algorithm to the training images, we concatenated the descriptor values into one vector and fed it to an SVM with n-class classification and a linear kernel.
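The classification stage can be sketched as follows. This is a hedged illustration, assuming scikit-learn's LinearSVC as the linear-kernel multi-class SVM; the synthetic random vectors merely stand in for the concatenated per-region HOG descriptors, and the class label order is our own choice.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical label order for the six emotions (illustration only).
EMOTIONS = ["happiness", "anger", "disgust", "fear", "sadness", "surprise"]

# Stand-in training data: six clusters of short random vectors playing the
# role of concatenated HOG descriptors, 10 training samples per emotion.
rng = np.random.default_rng(0)
centers = rng.normal(size=(6, 32)) * 3.0
X_train = np.vstack([centers[k] + rng.normal(size=(10, 32)) for k in range(6)])
y_train = np.repeat(np.arange(6), 10)

clf = LinearSVC()            # linear-kernel SVM, one-vs-rest multi-class
clf.fit(X_train, y_train)

pred = clf.predict(X_train[:1])   # classify one descriptor vector
print(EMOTIONS[int(pred[0])])
```

In the real system the same fit/predict pattern applies, with each training vector built by concatenating the normalized block histograms of one face image.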

Experimental Results and Discussion
We used two datasets. One is the well-known Cohn-Kanade dataset, which has been used in many studies and has become a gold-standard dataset; the other was collected randomly from the internet.

Datasets
Our database comprises two datasets.

The Cohn Kanade Dataset
It is one of the famous public datasets, used for research in automatic facial image analysis and synthesis and for perceptual studies. It consists of 486 FACS-coded sequences from 97 subjects. Subjects range in age from 18 to 30 years; 65% were female, 15% African-American and 20% Asian or Latino. Cohn-Kanade is available in two versions, CK and CK+. In our research we rely on the second version, CK+, which augments the dataset to include 593 sequences from 123 subjects and includes both posed and non-posed (spontaneous) expressions, fully FACS coded.

Other Videos
Dr. Paul Ekman, one of the best-known researchers in micro-expressions, has mentioned several examples of micro-expressions. We obtained some of those from the internet.
We prepared a dataset from the available material to examine our recognition system. The proposed system was trained on 10 different subjects, each performing different facial expressions. It was tested in two phases: the first on posed expressions of another 10 subjects performing different expressions, and the second on the collected videos that were not included in training.

Results
A summary of the results is given in Tables 1 and 2. The proposed system can classify the spontaneous facial expressions (happy, sad, fear, angry, disgust and surprise) in the two phases of testing. The results show that the proposed system classifies happy, angry, disgust, sad and surprise at maximum rates but fear at the minimum rate. This is due to the shorter distances between the brows, eyes, nose and mouth: during the fear expression the distances between them are almost equal to those in a neutral face, whereas the other expressions produce significant spatial changes, which makes them more detectable than fear. The system was implemented in MS VC++ 2010 with OpenCV 2.3.1 on a Core i5 Windows 7 workstation.

Conclusion
Facial micro-expressions are a useful behavioral source for danger demeanor detection, lie signs and hostile intent. In this paper we proposed a spontaneous facial expression recognition system for micro-expression analysis. We addressed the problems of detecting and tracking a human face in videos and recognizing the spontaneous expressions presented on those faces. A histogram of oriented gradients descriptor was used for feature extraction and a support vector machine for classification. The proposed system was tested on both static images and videos, achieving a 95% recognition rate on static images and an 80% recognition rate on videos.

Challenges and Future Work
In this work we proposed a system that deals with one frontal, unrotated face at a time. In future work we will handle more than one face and deal with rotated faces by adding an alignment step. We will also deal with other types of image sequences, such as 3D images.
The proposed algorithm consists of three main phases, as shown in Figure 1: face detection and tracking; feature extraction; facial expression classification.

Figure 1. Block diagram of the proposed algorithm

Figure 2. Face detection and tracking: (a) Example of input frame. (b) Face image

Figure 3.

Table 1. Experimental results: confusion matrix for facial expression recognition in videos

Table 2. Experimental results: confusion matrix for facial expression recognition in posed expressions