BodyInterest Garments Dataset for Interest Recognition


The BodyInterest dataset was collected specifically to address the task of garment body pose interest recognition. It consists of 900 videos of 30 users, 15 males and 15 females, looking at 160 garments, for a total of around 6 hours of recording. Users were instructed to look at garments and freely show their interest while their reaction was captured by an RGB camera positioned above. We also provide poses for each video extracted using the OpenPose pose detector.

Download Dataset

Body pose interest recognition is the task of assessing a user's interest in a garment by exploiting body pose information only. The users are asked to look at different garments and later express an interest level in the range of 0-5, which we call Degree of Interest (DoI). The users are free to express their interest in the way most natural to them, without any constraint.

We first collected 160 garments (80 male garments and 80 female garments) in two different formats: images and videos. Each format is shown to the user in a specific modality: slideshow and fashion show respectively. Each modality has been devised as follows:

  • Slideshows. This modality simulates the usual shop window where clothes are shown “statically”. It consists of videos made out of 2 or 3 still images of garments, where each garment is shown for 10 seconds. This is the simpler of the two modalities and represents the most common case. Each user is shown 10 videos containing 2 garments and 10 videos containing 3 garments, for a total of 20 slideshows.
  • Fashion shows. Unlike slideshows, this modality presents the user with videos of fashion shows, in which models walk the catwalk. In this scenario the user can appreciate more details than in the simpler, “static” set of clothes that appears in an ordinary shop window. Each user is shown 10 different fashion shows, each one featuring 4 garments.
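
As a quick sanity check, the per-user protocol described in the two modalities above can be multiplied out to the dataset-level totals. The sketch below uses only numbers stated in this description; the variable names are ours and are not part of the dataset.

```python
# Protocol constants taken from the dataset description.
USERS = 30

SLIDESHOWS_WITH_2_GARMENTS = 10
SLIDESHOWS_WITH_3_GARMENTS = 10
FASHION_SHOWS = 10
GARMENTS_PER_FASHION_SHOW = 4

# Videos and garment viewings for a single user.
videos_per_user = (
    SLIDESHOWS_WITH_2_GARMENTS + SLIDESHOWS_WITH_3_GARMENTS + FASHION_SHOWS
)
garment_views_per_user = (
    SLIDESHOWS_WITH_2_GARMENTS * 2
    + SLIDESHOWS_WITH_3_GARMENTS * 3
    + FASHION_SHOWS * GARMENTS_PER_FASHION_SHOW
)

# Totals over all users.
total_videos = USERS * videos_per_user
total_garment_views = USERS * garment_views_per_user

print(f"videos per user:        {videos_per_user}")
print(f"garment views per user: {garment_views_per_user}")
print(f"total videos:           {total_videos}")
print(f"total garment views:    {total_garment_views}")
```

Each user therefore contributes 30 videos and 90 garment viewings, which gives 900 videos and 2700 garment viewings across the 30 users.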

We collected 900 videos in total, amounting to around 6 hours of recording. Videos were acquired with an HD camera at 1280×720 resolution and 30 fps. Each video is labeled with a DoI for each garment it contains, covering the time that garment is on screen, for a total of 2700 garment annotations. Poses are also extracted from each video using the OpenPose pose extractor.
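
Since each DoI applies to the whole time a garment is on screen, a per-garment label can be expanded into per-frame labels when training frame-level models. The following is a minimal sketch for the slideshow modality only. It assumes that every garment occupies exactly 10 seconds at 30 fps with no transition frames, and that the DoI values are available as a plain list in display order. The function name and the input layout are hypothetical and do not describe the actual annotation files.

```python
FPS = 30                  # recording frame rate stated in the description
SECONDS_PER_GARMENT = 10  # slideshow display time per garment


def slideshow_frame_labels(dois, fps=FPS, seconds_per_garment=SECONDS_PER_GARMENT):
    """Expand one DoI per garment into one DoI label per video frame.

    `dois` is a hypothetical list of DoI values (0-5), one per garment,
    in the order the garments appear in the slideshow.
    """
    frames_per_garment = fps * seconds_per_garment
    labels = []
    for doi in dois:
        if not 0 <= doi <= 5:
            raise ValueError(f"DoI must be in the range 0-5, got {doi}")
        # Every frame in which the garment is on screen gets the same DoI.
        labels.extend([doi] * frames_per_garment)
    return labels


# A 3-garment slideshow with made-up DoI values.
labels = slideshow_frame_labels([4, 1, 3])
print(len(labels))
```

Fashion show videos would need the real on-screen intervals of each garment, which this sketch does not attempt to guess.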

If you use this dataset, please cite our paper as follows:

author = {Wolmer Bigi, Claudio Baecchi and Alberto Del Bimbo},
title = {Automatic Interest Recognition from Posture and Behaviour},
booktitle = {MM ’20: Proceedings of the 28th ACM International Conference on Multimedia},
address = {New York, NY, USA},
publisher = {Association for Computing Machinery},
year = {2020}
