This article describes a new dataset under construction at the Media Integration and Communication Center and the University of Florence. The dataset consists of high-resolution 3D scans of the face of each subject, along with several video sequences of varying resolution and zoom level. Each subject is recorded in a controlled setting in HD video, then in a less-constrained (but still indoor) setting using a standard PTZ surveillance camera, and finally in an unconstrained outdoor environment with challenging conditions. In each sequence the subject is recorded at three levels of zoom. The dataset is being constructed specifically to support research on techniques that bridge the gap between 2D appearance-based recognition techniques and fully 3D approaches. It is designed to simulate, in a controlled fashion, realistic surveillance conditions and to probe the efficacy of exploiting 3D models in real scenarios.

[Figure: High definition of our mesh]
[Figure: The acquisition process in our dataset]

The data for each subject consists of the following:

  • Four high-quality 3D models: two frontal models (one for testing and one for training), one left-side model, and one right-side model. If the subject wears glasses, we also provide a 3D model with the glasses on.
  • One HD video (1280×720) in a cooperative environment, recorded at 4 levels of zoom. The subject is asked to perform out-of-plane head rotations by looking at six points: top-right, top-left, middle-right, middle-left, bottom-right, and bottom-left. Frame rate: 25 fps.
  • One indoor video (704×576, 4CIF) from a PTZ camera with 3 levels of zoom. The subject is asked to behave spontaneously. Frame rate: 25 fps.
  • One outdoor video (736×544) from a PTZ camera with 3 levels of zoom. The subject is again asked to behave spontaneously, but the recording conditions are much more challenging. Frame rate: 5-7 fps.

The archive for a subject is typically about 400 MB and contains the four models, each in three different formats: OBJ, PLY, and VRML.
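As a rough way to check the resolution of a downloaded model, the sketch below counts the vertex and face records in an OBJ file. It relies only on the standard OBJ convention that vertex lines begin with `v` and face lines with `f`; the function name `count_obj_elements` is hypothetical and not part of any tooling distributed with the dataset.

```python
def count_obj_elements(path):
    """Count vertex ('v') and face ('f') records in a Wavefront OBJ file.

    Hypothetical helper for illustration; normals ('vn') and texture
    coordinates ('vt') are deliberately not counted as vertices.
    """
    n_vertices = 0
    n_faces = 0
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            parts = line.split(maxsplit=1)
            if not parts:
                continue  # skip blank lines
            if parts[0] == "v":
                n_vertices += 1
            elif parts[0] == "f":
                n_faces += 1
    return n_vertices, n_faces
```

Calling `count_obj_elements` on one of the extracted `.obj` files returns a `(vertices, faces)` tuple, which gives a quick sense of mesh density without loading the model into a 3D library.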

The videos are encoded in MJPEG.

The files for each subject are packed into a single bzip2-compressed tarball (tar.bz2).
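A minimal sketch of unpacking one subject archive with Python's standard `tarfile` module and grouping the extracted files by extension is shown below. The internal layout and file names of the archive are not described here, so the function makes no assumption about them; the function name `extract_subject` and any archive path passed to it are hypothetical.

```python
import tarfile
from collections import defaultdict
from pathlib import Path


def extract_subject(archive_path, dest_dir):
    """Extract a tar.bz2 subject archive and group its files by extension.

    Returns a dict mapping a lowercase suffix (e.g. '.obj') to a sorted
    list of extracted file paths. Hypothetical helper for illustration.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    with tarfile.open(archive_path, mode="r:bz2") as tar:
        # Use the safer "data" extraction filter when this Python provides it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)

    groups = defaultdict(list)
    for path in sorted(dest.rglob("*")):
        if path.is_file():
            groups[path.suffix.lower()].append(path)
    return dict(groups)
```

Grouping by extension makes it easy to pick out the model files in a given format or the video files, whatever directory structure the archive uses.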

If you use the dataset, please cite our paper as follows:

author = "Bagdanov, Andrew D. and Masi, Iacopo and Del Bimbo, Alberto",
title = "The Florence 2D/3D Hybrid Face Dataset",
booktitle = "Proc. of ACM Multimedia Int'l Workshop on Multimedia access to 3D Human Objects (MA3HO'11)",
month = "December",
year = "2011",
publisher = "ACM Press",
organization = "ACM",

The dataset is maintained by Iacopo Masi, email: mas@dsi.unifi.it

