Data-Driven Approaches for Social Media Analysis and Annotation


The great success of online social platforms for creating, sharing and tagging user-generated media has led to strong interest from the multimedia and computer vision communities in methods and techniques for annotating and searching social media. Visual content similarity, geo-tags and tag co-occurrence, together with social connections and comments, can be exploited to perform tag suggestion, content classification and clustering, enabling more effective semantic indexing and retrieval of visual data.

However, the relatively low quality of these metadata must be overcome: user-produced tags and annotations are known to be ambiguous, imprecise and/or incomplete, excessively personalized, and limited. At the same time, any method must take into account the ‘web-scale’ quantity of media and the fact that social network users continuously add new images and create new terms.

We also performed an extensive and rigorous evaluation on two standard large-scale datasets, showing that the performance of nearest-neighbor methods is comparable to that of more complex and computationally intensive approaches. Unlike the latter, nearest-neighbor methods can be applied to ‘web-scale’ data.
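To make the idea of nearest-neighbor tag refinement concrete, the following is a minimal sketch of generic neighbor voting: the tags of the visually most similar images vote for the tags of a query image. It is not the released evaluation code. The function names, the toy feature vectors and the use of cosine similarity are all assumptions made for illustration; in practice, similarities could instead be read from precomputed similarity matrices.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def refine_tags(query_features, pool, k=3, top_n=2):
    """Rank tags by how many of the k nearest neighbors carry them.

    `pool` is a list of (features, tags) pairs (hypothetical layout).
    """
    # Sort the pool by decreasing visual similarity to the query.
    ranked = sorted(
        pool,
        key=lambda item: cosine_similarity(query_features, item[0]),
        reverse=True,
    )
    # Each of the k nearest neighbors casts one vote per distinct tag.
    votes = Counter()
    for _, tags in ranked[:k]:
        votes.update(set(tags))
    # Order by vote count, breaking ties alphabetically for determinism.
    ordered = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ordered[:top_n]]


# Toy data: 2-D "visual features" with user tags (illustrative only).
pool = [
    ([1.0, 0.0], ["beach", "sea"]),
    ([0.9, 0.1], ["beach", "sunset"]),
    ([0.8, 0.2], ["sea", "beach"]),
    ([0.0, 1.0], ["city", "night"]),
]
query = [1.0, 0.05]

refined = refine_tags(query, pool, k=3, top_n=2)
print(refined)
```

Because the only costly step is finding the nearest neighbors, this kind of method needs no per-tag model training, which is what makes it attractive when new images and new terms keep arriving.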

Here we make available the code and the metadata for NUS-WIDE-240K.

  • Evaluation code (~8.5 GB, code + similarity matrices)
  • NUS-WIDE-240K dataset metadata (JSON format, about 25 MB). A subset of 238,251 images from NUS-WIDE-270K that we retrieved from Flickr together with user data. Note that NUS is now releasing the full image set subject to an agreement and disclaimer form.

If you use this data, please cite:

  1. T. Uricchio, L. Ballan, M. Bertini, and A. Del Bimbo, “Data-Driven Approaches for Social Image and Video Tagging,” in Multimedia Tools and Applications, accepted.
  2. T. Uricchio, L. Ballan, M. Bertini, and A. Del Bimbo, “An evaluation of nearest-neighbor methods for tag refinement,” in Proc. of IEEE International Conference on Multimedia & Expo (ICME), San Jose, CA, USA, 2013.

For bugs and/or suggestions, please email me, thanks.