In this talk Dr. Oier Lopez de Lacalle introduces vSTS, a new dataset for measuring the textual similarity of sentences using multimodal information. The dataset comprises images paired with their textual captions, and makes it possible to study whether better sentence representations can be built when the corresponding image is available alongside each caption than when only the text is available.
We describe the dataset both quantitatively and qualitatively, and claim that it is a valid gold standard for evaluating automatic multimodal textual similarity systems.
vSTS extends the existing semantic textual similarity tasks with images, and aims to be a standard dataset for testing the contribution of visual information when evaluating sentence representations.
We also describe initial experiments that combine the multimodal information. These experiments show that the dataset makes it possible to explore two hypotheses:
- H1) whether the image representations alone are able to predict caption similarity;
- H2) whether a combination of image and text representations improves on the text-only results for this similarity task.
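As a rough illustration of how the two hypotheses could be tested, the sketch below scores each caption pair three ways (text only, image only, and text plus image) and correlates each score with the gold similarity. It is not the method presented in the talk: the cosine scoring, the naive concatenation of text and image vectors, the function names, and the random stand-in data (including the 0 to 5 gold range) are all assumptions made for the example.

```python
import numpy as np


def cosine(a, b):
    """Row-wise cosine similarity between two (n, d) arrays."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)


def pearson(x, y):
    """Pearson correlation between two 1-D score vectors."""
    return float(np.corrcoef(x, y)[0, 1])


def evaluate(text_a, text_b, img_a, img_b, gold):
    """Correlate three similarity scores with the gold annotations."""
    text_only = cosine(text_a, text_b)
    # H1: image representations alone
    image_only = cosine(img_a, img_b)
    # H2: text and image combined (here by simple concatenation, an assumption)
    combined = cosine(np.hstack([text_a, img_a]), np.hstack([text_b, img_b]))
    return {
        "text": pearson(text_only, gold),
        "image": pearson(image_only, gold),
        "text+image": pearson(combined, gold),
    }


# Random stand-in data: real experiments would use sentence and image
# embeddings for each caption pair, plus human similarity judgements.
rng = np.random.default_rng(0)
n = 50
text_a, text_b = rng.normal(size=(n, 8)), rng.normal(size=(n, 8))
img_a, img_b = rng.normal(size=(n, 6)), rng.normal(size=(n, 6))
gold = rng.uniform(0, 5, size=n)

results = evaluate(text_a, text_b, img_a, img_b, gold)
for name, r in results.items():
    print(f"{name}: Pearson r = {r:.3f}")
```

With real embeddings, H1 would be supported by a clearly positive correlation in the "image" row, and H2 by the "text+image" row exceeding the "text" row. On the random data above the numbers carry no meaning.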
Biography: Dr. Oier Lopez de Lacalle holds a B.Eng. in Informatics (2003) and received his Ph.D. in Computer Science in 2009 from the University of the Basque Country (UPV / EHU).
He is currently a post-doctoral researcher on the MUSTER project in the IXA group of the UPV / EHU. Previously,
he was a researcher hired by Ikerbasque at the University of Edinburgh. Over these years he has published more than 40 articles in
journals and international conferences in the area of Natural Language Processing (NLP) and Artificial Intelligence (AI).
His research focuses mainly on semantic processing and information extraction using probabilistic modeling and deep learning algorithms.
His contributions include various publications in JCR journals and high-level conferences (CORE A+). He is a regular reviewer for international journals and serves on the program committees of international conferences such as EACL, NAACL, ACL, EMNLP, and IJCNLP.