Learning and Recognizing Visual Object Categories
Over the past few years there has been substantial progress in the development of techniques for recognizing generic categories of objects in images, such as automobiles, bicycles, airplanes, and human faces. Much of this progress can be traced to two underlying technical advances: (1) detectors for locally invariant image features, and (2) the application of techniques from machine learning. Despite these successes, there are fundamental concerns about methods that rely heavily on feature detection: the local image evidence used in detection decisions is often highly ambiguous, because it is considered without contextual information. We are taking a different approach to learning and recognizing visual object categories, in which there is no separate feature detection stage. In our approach, objects are modeled as local image patches with spring-like connections that constrain the spatial relations between patches. Such models are intuitively natural, and their use dates back over 30 years. They were largely abandoned until recently because of computational challenges, which our work addresses. Our approach can be used to learn models from weakly labeled training data, without any specification of the location of objects or their parts. The recognition accuracy of such models is better than that of techniques based on feature detection that encode similar forms of spatial constraint.
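To make the idea of patches joined by spring-like connections concrete, the following is a minimal sketch and not the method presented in the talk. It assumes a quadratic spring cost, a per-part table of appearance (match) costs at a handful of candidate locations, and a brute-force search over all placements; the names spring_cost, configuration_energy, best_configuration, match_cost, and springs are all hypothetical and chosen only for illustration.

```python
from itertools import product


def spring_cost(loc_a, loc_b, rest_offset, stiffness=1.0):
    """Quadratic cost for stretching a spring away from its rest offset.

    Assumption: the abstract only says "spring-like"; a quadratic penalty
    is one common way to model that, not necessarily the one used.
    """
    dx = (loc_b[0] - loc_a[0]) - rest_offset[0]
    dy = (loc_b[1] - loc_a[1]) - rest_offset[1]
    return stiffness * (dx * dx + dy * dy)


def configuration_energy(locations, match_cost, springs):
    """Total cost: how poorly each patch matches plus how far springs stretch."""
    appearance = sum(match_cost[part][loc] for part, loc in locations.items())
    deformation = sum(
        spring_cost(locations[a], locations[b], offset, k)
        for a, b, offset, k in springs
    )
    return appearance + deformation


def best_configuration(match_cost, springs):
    """Exhaustively try every placement of every part (illustration only)."""
    parts = list(match_cost)
    candidates = [list(match_cost[part]) for part in parts]
    best = None
    for combo in product(*candidates):
        locations = dict(zip(parts, combo))
        energy = configuration_energy(locations, match_cost, springs)
        if best is None or energy < best[0]:
            best = (energy, locations)
    return best


# Hypothetical two-part model: part "b" should sit 2 pixels right of part "a".
match_cost = {
    "a": {(0, 0): 0.0, (5, 5): 1.0},
    "b": {(2, 0): 0.5, (9, 9): 0.0},  # "b" alone looks best at (9, 9)
}
springs = [("a", "b", (2, 0), 1.0)]  # (part, part, rest offset, stiffness)

energy, locations = best_configuration(match_cost, springs)
print(energy, locations)
```

In this toy example, part "b" matches the image slightly better at (9, 9) when judged in isolation, but the spring to part "a" makes (2, 0) the better joint choice. That is the kind of contextual disambiguation a separate feature detection stage cannot provide. The exhaustive search here grows exponentially with the number of parts and is only meant to show what is being minimized, not how to minimize it efficiently.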
Author: Dan Huttenlocher, Computer Science Department, Johnson Graduate School of Management