Stationary Features and Cat Detection

Posted in Conferences, Companies, Science on December 29, 2008



Most discriminative techniques for detecting instances from object categories in still images consist of looping over a partition of a pose space with dedicated binary classifiers. The efficiency of this strategy for a complex pose, i.e., for fine-grained descriptions, can be assessed by measuring the effect of sample size and pose resolution on accuracy and computation. Two conclusions emerge: (1) fragmenting the training data, which is inevitable in dealing with high in-class variation, severely reduces accuracy; (2) the computational cost at high resolution is prohibitive due to visiting a massive pose partition.

To overcome data fragmentation, we propose a novel framework centered on pose-indexed features, which assign a response to a pair consisting of an image and a pose. These features are designed to be stationary: whenever an object is actually present at the given pose, the probability distribution of the response is the same, whatever that pose is. Such features allow for efficient, one-shot learning of pose-specific classifiers. To avoid expensive scene processing, we arrange these classifiers in a hierarchy based on nested partitions of the pose space; as in previous work on coarse-to-fine search, this allows for efficient processing.
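As a rough illustration of what "pose-indexed" means, here is a minimal Python sketch. It is not the feature family used in the talk: the pose parameterization (row, col, scale), the pixel-comparison feature, and the helper names are all assumptions made for this toy example. The point it tries to show is that when probe locations are defined relative to the pose, the same object at two different poses produces the same response.

```python
from typing import List, Tuple

Image = List[List[float]]
Pose = Tuple[int, int, int]  # (row, col, scale): hypothetical pose parameterization


def pose_indexed_feature(image: Image, pose: Pose,
                         offset_a: Tuple[int, int],
                         offset_b: Tuple[int, int]) -> int:
    """Compare two pixels whose locations are defined relative to the pose.

    Offsets live in the object's own frame and are scaled by the pose,
    so the feature probes the same object parts wherever the object sits.
    """
    row, col, scale = pose
    a_row, a_col = row + scale * offset_a[0], col + scale * offset_a[1]
    b_row, b_col = row + scale * offset_b[0], col + scale * offset_b[1]
    return 1 if image[a_row][a_col] > image[b_row][b_col] else 0


def paste(image: Image, pattern: Image, row: int, col: int, scale: int) -> None:
    """Draw `pattern` at (row, col), magnifying each cell to scale x scale pixels."""
    for r, line in enumerate(pattern):
        for c, value in enumerate(line):
            for dr in range(scale):
                for dc in range(scale):
                    image[row + scale * r + dr][col + scale * c + dc] = value


# Toy "object": a 2x2 checker pattern (an assumption, standing in for a cat).
pattern = [[9.0, 1.0],
           [1.0, 9.0]]
scene = [[0.0] * 16 for _ in range(16)]
paste(scene, pattern, 1, 1, 1)  # small instance near the top-left corner
paste(scene, pattern, 6, 6, 3)  # larger instance elsewhere in the scene

small_pose, large_pose = (1, 1, 1), (6, 6, 3)
responses = [pose_indexed_feature(scene, pose, (0, 0), (0, 1))
             for pose in (small_pose, large_pose)]
```

Here `responses` holds the same value for both instances, even though they differ in location and size. In the real setting stationarity is a statement about probability distributions over natural images rather than an exact equality, but the consequence is the same: training samples from all poses can be pooled to learn one classifier.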

The hierarchy is then "folded" for training: all the classifiers at each level are derived from one base predictor learned from all the data. The hierarchy is "unfolded" for testing: parsing a scene amounts to examining increasingly fine object descriptions only when there is sufficient evidence for coarser ones. In this way, the detection results are equivalent to an exhaustive search at high resolution. We illustrate these ideas by detecting and localizing cats in highly cluttered greyscale scenes.
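The folded/unfolded idea can be sketched in a few lines of Python. This is only a toy under strong assumptions: the pose space is one-dimensional, cells are split in half at each level, and the "predictor" is a hand-written threshold on the maximum value in a cell rather than anything learned. All names (`make_level_classifier`, `coarse_to_fine`) are hypothetical.

```python
from typing import Callable, List, Tuple

Cell = Tuple[int, int]  # (start, length): one cell of a 1-D pose partition
Classifier = Callable[[List[float], Cell], bool]


def make_level_classifier(threshold: float) -> Classifier:
    """Build the single predictor shared by every cell of one level ("folded")."""
    def classify(signal: List[float], cell: Cell) -> bool:
        start, length = cell
        # Stand-in for a learned predictor fed with pose-indexed features.
        return max(signal[start:start + length]) >= threshold
    return classify


def coarse_to_fine(signal: List[float], classifiers: List[Classifier],
                   root: Cell) -> Tuple[List[int], int]:
    """Unfold the hierarchy: refine a cell only if its classifier fires.

    Returns the starts of the detected finest cells and the number of
    classifier evaluations performed.
    """
    detections, calls = [], 0
    frontier = [(root, 0)]
    while frontier:
        cell, level = frontier.pop()
        calls += 1
        if not classifiers[level](signal, cell):
            continue  # not enough evidence: prune this whole sub-tree
        start, length = cell
        if level == len(classifiers) - 1:
            detections.append(start)
        else:
            half = length // 2
            frontier.append(((start, half), level + 1))
            frontier.append(((start + half, length - half), level + 1))
    return sorted(detections), calls


signal = [0.1] * 16
signal[11] = 1.0  # a single "object" hidden in low-level clutter
classifiers = [make_level_classifier(0.5) for _ in range(5)]  # cells of 16, 8, 4, 2, 1

detections, calls = coarse_to_fine(signal, classifiers, (0, len(signal)))
exhaustive = [i for i in range(len(signal)) if classifiers[-1](signal, (i, 1))]
```

In this toy, `detections` matches the exhaustive scan at the finest resolution, while `calls` is smaller than the number of finest cells because most of the pose space is discarded at coarse levels. The saving grows with the size of the pose space relative to the number of objects present.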

Joint work with Donald Geman

Speaker: François Fleuret
Dr. François Fleuret received the PhD degree in probability from the University of Paris VI in 2000 and the Habilitation degree in applied mathematics from the University of Paris XIII in 2006.

After one year at the Department of Computer Science, University of Chicago, in 2000, he was hired as a full researcher at the French National Institute for Research in Computer Science and Control (INRIA). In 2004, he moved to the Computer Vision Laboratory at EPFL, where he spent three years, before joining the Idiap Research Institute in 2007 as a Senior Researcher in machine learning. His main research interests are at the interface between statistical modeling and machine learning, with an emphasis on algorithmic efficiency.

Google Tech Talks
October 31, 2008
