What Do Those Images Have In Common?

Posted in Conferences, Companies, Science on April 25, 2008

Google Tech Talks
March 25, 2008


This talk is about discovering and modeling previously unspecified, recurring themes in a given set of arbitrary images. Given a set of images, each containing frequent occurrences of objects from multiple categories, the goal is to learn a compact model of the categories and of their relationships, so that any occurrences can later be recognized and segmented in new images. Categories are not defined by the user, and it is not known whether or where instances of any category appear in a specific image. This problem is challenging because it raises the following open questions. What is an object category? Which image properties should be used, and how should they be combined, to discover category occurrences? What is an efficient multicategory representation?

We will examine a methodology developed during my postdoctoral work at UIUC. Each image is represented by a segmentation tree whose nodes correspond to image regions, segmented at all natural scales present, and whose edges capture the embedding of regions within one another. The presence of any categories in the image set is then reflected in the frequent occurrence of similar subtrees within the segmentation trees. Our methodology is designed to: (1) match image trees to find similar subtrees; (2) discover categories by clustering similar subtrees, and use the properties of each cluster to learn the model of the associated category; and (3) learn the grammar of the discovered categories that compactly captures their recursive definitions in terms of other simpler (sub)categories and their relationships (e.g., containment, co-occurrence, and sharing of simple categories by more complex ones). When a new image is encountered, its segmentation tree is matched against the learned grammar to simultaneously recognize and segment all occurrences of the learned categories. This matching also provides a semantic explanation of object recognition in terms of the identified parts along with their spatial relationships.
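
To make the idea of a segmentation tree and of step (1) more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm presented in the talk: the `Region` class, its two features (`area`, `intensity`), and the greedy child matching in `subtree_similarity` are all hypothetical choices made only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """One node of a segmentation tree: a region plus the regions embedded in it."""
    area: float        # region size in pixels, assumed > 0 (illustrative feature)
    intensity: float   # mean gray level in [0, 1] (illustrative feature)
    children: List["Region"] = field(default_factory=list)


def node_similarity(a: Region, b: Region) -> float:
    """Similarity in [0, 1] built from the area ratio and the intensity difference."""
    area_term = min(a.area, b.area) / max(a.area, b.area)
    intensity_term = 1.0 - abs(a.intensity - b.intensity)
    return area_term * intensity_term


def subtree_similarity(a: Region, b: Region) -> float:
    """Root similarity plus a greedy one-to-one matching of the child subtrees.

    Greedy matching is an assumption made for brevity; it is not guaranteed
    to find the best possible assignment of children.
    """
    score = node_similarity(a, b)
    unused = list(b.children)
    for child in a.children:
        if not unused:
            break
        best = max(unused, key=lambda other: subtree_similarity(child, other))
        score += subtree_similarity(child, best)
        unused.remove(best)
    return score


# Two toy "car" trees (a body containing two dark wheels) and a plain region.
car_a = Region(1000, 0.5, [Region(100, 0.1), Region(100, 0.1)])
car_b = Region(1000, 0.5, [Region(100, 0.1), Region(100, 0.1)])
blob = Region(1000, 0.5)

same_category_score = subtree_similarity(car_a, car_b)
different_score = subtree_similarity(car_a, blob)
```

In this toy setting the two car trees score higher against each other than a car tree does against the plain region, because the embedded wheel regions also find matches. Subtrees with high mutual scores would be the candidates grouped together in the clustering of step (2).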

The same methodology can also be used to identify recurring image themes of a more general kind. One example is extracting the stochastically repeating, elementary parts of image texture (e.g., water lilies on the water surface, people in a crowd).

This talk will be taped by the engEDU Tech Talks Team.

Speaker: Sinisa Todorovic
Sinisa Todorovic received the joint B.S./M.S. degree with honors in electrical engineering from the University of Belgrade, Serbia, in 1994. From 1994 until 2001, he worked in the communications industry. He received the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Florida, Gainesville, in 2002 and 2005, respectively. Since 2005, he has held the position of Postdoctoral Research Associate in the Beckman Institute at the University of Illinois Urbana-Champaign, where he collaborates with Prof. Narendra Ahuja. Sinisa's main research interests concern computer vision and machine learning, with a current focus on unsupervised extraction and representation of visual themes recurring in images. He is the recipient of the Jack Neubauer Best Paper Award 2004 for a publication in IEEE Trans. Vehicular Technology, and of the Outstanding Reviewer Award at the Int. Conf. on Computer Vision (ICCV) 2007. He serves as Associate Editor of Advances in Multimedia.


Tags: Techtalks, Google, Conferences, Science, Lectures, DSP, Computer Science, engEDU, Education, Google Tech Talks, Broadcasting, Companies