Facial characteristics such as Gender, Age, Ethnicity and head pose are important to estimate in many computer vision applications. Estimates of Gender and Age can be used to adapt the advertising displayed on nearby screens, and pose estimation can allow users to interact with devices simply by looking at them. Although head pose is not a biometric characteristic, it is related to a person's gaze and can therefore be an important cue for understanding behaviour and social interaction. Despite the attention it has received in recent years, the estimation of multiple face characteristics, and especially of multiple soft biometrics like Age, Gender and Ethnicity, remains a difficult problem and an active area of research in the computer vision community.
Most systems for estimating characteristics like Age, head pose and Gender rely on their own sets of custom features and on specific estimation techniques. This can be wasteful, since much of the work (such as feature extraction) is duplicated. Instead of estimating characteristics individually, we believe that a single system can jointly estimate all of them using a single pool of features and a single estimator. In this way, the estimation of multiple characteristics can be made more efficient and more robust.
We believe that random decision forests can provide a unified framework for multi-objective estimation. In this project we show how they can be used to simultaneously estimate multiple characteristics using a single pool of features (e.g. Figure 1). We propose a new information gain formulation that enables the use of multiple (potentially heterogeneous) characteristics to train a random forest. We demonstrate the effectiveness of the proposed approach for jointly estimating head pose, Gender, Age and Ethnicity from single face images. Figure 1 illustrates the main idea behind Multi-Objective Random Forests (MORF): early levels of each random tree tend to specialize on a subset of characteristics, effectively conditioning later levels on them.
Our proposed Multi-Objective Random Forests (MORF) framework is a unified model for the joint estimation of multiple characteristics that automatically adapts the measure used to evaluate the quality of weak learners. Since facial characteristics are related in the feature space, estimating all of them jointly can be beneficial, as trees can learn to condition the estimation of some characteristics on others. To this end, we reformulate the splitting criterion of random trees for the multi-objective setting.
We define a new normalized measure of information gain for multi-objective random forests, the locally weighted information gain. It weights the information gain of each characteristic by the ratio between the entropy of that characteristic at the current node and its entropy at the root node. The main idea is that these weights are updated during training, so that the information gain of each characteristic is scaled by how much of its entropy remains at the current depth. We evaluate the approach on publicly available imagery for face characteristic estimation and obtain promising results (e.g. Figure 3).
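The weighting described above can be illustrated with a minimal Python sketch. It is not the formulation from the paper: it assumes that every characteristic has been discretized into class labels so that Shannon entropy applies, and that the per-characteristic gains are combined by a plain weighted sum. The function names (`entropy`, `information_gain`, `locally_weighted_gain`) and the dictionary layout are hypothetical and chosen only for illustration.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of discrete labels."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    # Adding 0.0 turns a possible -0.0 into 0.0 for pure nodes.
    return float(-(p * np.log2(p)).sum()) + 0.0

def information_gain(labels, left_mask):
    """Standard information gain of a binary split for one characteristic."""
    n = len(labels)
    left, right = labels[left_mask], labels[~left_mask]
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

def locally_weighted_gain(node_labels, left_mask, root_entropies):
    """Weighted sum of per-characteristic gains at one node (assumed form).

    node_labels:    dict mapping characteristic name -> label array at this node
    left_mask:      boolean array, True for samples sent to the left child
    root_entropies: dict mapping characteristic name -> entropy at the root
    """
    total = 0.0
    for name, labels in node_labels.items():
        h_root = root_entropies[name]
        # Weight = entropy remaining at this node relative to the root.
        weight = entropy(labels) / h_root if h_root > 0 else 0.0
        total += weight * information_gain(labels, left_mask)
    return total
```

Under these assumptions, a characteristic that is already pure at a node receives weight zero, so candidate splits at that node are scored only on the characteristics that still carry uncertainty. This is consistent with the behaviour described above, where early levels specialize on some characteristics and later levels on the others.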
Dario Di Fina, Svebor Karaman, Andrew D. Bagdanov, Alberto Del Bimbo, “MORF: Multi-Objective Random Forests for Face Characteristic Estimation”, Proc. of IEEE International Conference on Advanced Video and Signal based Surveillance (AVSS), 2015