PhD Position on Multimodal Semantic Spaces Available

One PhD position/studentship to study integrated text-vision semantic
spaces is available in the Language, Interaction and Computation track
of the 3-year PhD program offered by the Center for Mind/Brain Sciences
at the University of Trento (Italy):

The PhD program (start date: November 2010) is taught in English by an
international faculty. The Language, Interaction and Computation track is
organized by CLIC, an interdisciplinary group of researchers studying
verbal and non-verbal communication using both computational and
cognitive methods:

CLIC is part of the larger network of research labs focusing on Natural
Language Processing and related domains in the Trento region, which is
quickly becoming one of the areas with the highest concentration of NLP
researchers in Europe.

The studentship is sponsored by a Google Research Award, and the PhD
project will be carried out as a collaboration between CLIC members and
the Zurich Google Research team.

* Project Outline *

The automated measurement of semantic similarity (similarity in meaning)
between words/concepts through unsupervised statistical semantic space
models such as Latent Semantic Analysis or Topic Models has been a
success story in text mining (see Turney and Pantel, 2010, for a recent
survey).

Today, through the Web, we have access to huge amounts of documents that
contain both text and images. While the use of text to improve image
labeling and retrieval is an active and growing area of research (e.g.,
Feng and Lapata, 2008, Moringen, 2008, Mathe et al., 2008, Hare et al.,
2008, Olivares et al., 2008, Wang et al., 2009), in this project we want
to go the other way around, and develop novel techniques to extract
multimodal semantic spaces from texts and images, in order to improve
the measurement of semantic similarity among words. On the one hand, it
has been shown (Baroni and Lenci, 2009) that text-extracted conceptual
descriptions are lacking exactly in those aspects (such as color, shape
and parts of objects) that are likely to be most salient in visual
depictions of the same objects. On the other, a recent trend in computer
vision is to represent images as vectors that record the occurrence, in
the analyzed image, of a discrete vocabulary of “visual words” (Yang et
al., 2007, and references therein). This development paves the way for the
integration of visual word co-occurrence features into the classic
text-based vectors of current semantic space models.
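
As a rough illustration of the kind of integration meant here, the sketch
below concatenates a text-based co-occurrence vector with a visual word
count vector for each target word and compares words by cosine similarity.
It is only one simple possibility, not the method the project will adopt:
the counts are invented toy values, and the function names and the
visual_weight parameter are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def combine(text_vec, visual_vec, visual_weight=0.5):
    """Concatenate normalized text and visual vectors.

    visual_weight is a hypothetical mixing parameter: 0 keeps only the
    text channel, 1 keeps only the visual channel.
    """
    t = l2_normalize(text_vec)
    v = l2_normalize(visual_vec)
    return [(1 - visual_weight) * x for x in t] + [visual_weight * x for x in v]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

# Toy data (invented for illustration only).
# Text dimensions: co-occurrence counts with three context words.
text_counts = {
    "cherry":     [8, 5, 0],
    "strawberry": [7, 6, 0],
    "truck":      [1, 0, 9],
}
# Visual dimensions: counts of three "visual words" in images
# associated with each target word.
visual_counts = {
    "cherry":     [9, 7, 0],
    "strawberry": [8, 3, 1],
    "truck":      [1, 0, 9],
}

multimodal = {
    word: combine(text_counts[word], visual_counts[word])
    for word in text_counts
}

sim_fruit = cosine(multimodal["cherry"], multimodal["strawberry"])
sim_mixed = cosine(multimodal["cherry"], multimodal["truck"])
print(f"cherry ~ strawberry: {sim_fruit:.3f}")
print(f"cherry ~ truck:      {sim_mixed:.3f}")
```

Normalizing each channel before concatenation keeps the modality with the
larger raw counts from dominating the similarity score. How to weight and
combine the two channels is one of the open questions of the project.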

The topic is expected to have a strong impact both on the applied front, as
a breakthrough in the acquisition of large semantic repositories (we
will explore in particular applications to information retrieval), and
from a theoretical point of view, leading to “embodied” models of
computational learning that are more directly comparable to what human
learners do (Barsalou, 2008, Glenberg and Mehta, 2008).

* Application Information *

The successful candidate will have a strong computational background,
including familiarity with machine learning and/or statistical methods,
and should be familiar with the basics of either natural language
processing or (preferably) computer vision. An interest in exploring the
connections between artificial and natural intelligence and cognition is
also desirable.

The official call of the Doctoral School in Cognitive and Brain Sciences
will be announced shortly, and application details will be available
at the page:

We strongly encourage a preliminary expression of interest in the
project. Please contact Marco Baroni (marco.baroni(at), attaching
a CV in pdf or txt format, or a link to an online CV.