Looking at People
There is a great need for programs that can describe what people are doing from video. This is difficult for several reasons: it is hard to identify and track people in video sequences; we have no canonical vocabulary for describing what people are doing; and phenomena such as aspect and individual variation greatly affect the appearance of an activity. Recent work in kinematic tracking has produced methods that can report the kinematic configuration of the body fairly accurately and fully automatically.
The problem of vocabulary is more difficult. I will discuss a generative activity model in which activities are assembled from a set of distinct spatial and temporal components. The component models are learned from labelled motion-capture data and are composed in a way that makes it possible to learn very complex finite automata without estimating large numbers of parameters. The advantage of such a model is that one can search video for examples of an activity specified in a simple query language, without possessing any example of the activity sought. In this case, aspect is dealt with by explicit 3D reasoning.
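The abstract does not give implementation details, but the composition idea can be illustrated with a minimal sketch. All primitive names below are hypothetical, and plain sequential composition stands in for the richer spatial and temporal operators the talk describes; the point is that gluing automata together adds structure without adding parameters.

```python
class Automaton:
    """A tiny NFA over motion-primitive labels (illustrative sketch only)."""

    def __init__(self, states, start, accept, transitions):
        self.states = states            # set of int state ids
        self.start = start              # start state
        self.accept = accept            # set of accepting states
        self.transitions = transitions  # (state, label) -> set of next states

    @classmethod
    def primitive(cls, label):
        """An automaton accepting exactly one motion primitive."""
        return cls({0, 1}, 0, {1}, {(0, label): {1}})

    def then(self, other):
        """Temporal composition: self followed by other.

        Composition adds structure but no new parameters, which is the
        appeal of assembling complex models from learned components."""
        offset = max(self.states) + 1
        trans = {k: set(v) for k, v in self.transitions.items()}
        for (s, label), targets in other.transitions.items():
            shifted = {t + offset for t in targets}
            trans[(s + offset, label)] = shifted
            if s == other.start:
                # Glue the pieces: from our accepting states, allow the
                # first moves of the follow-on automaton.
                for acc in self.accept:
                    trans.setdefault((acc, label), set()).update(shifted)
        return Automaton(self.states | {s + offset for s in other.states},
                         self.start,
                         {s + offset for s in other.accept},
                         trans)

    def accepts(self, sequence):
        """Standard NFA simulation over a sequence of primitive labels."""
        current = {self.start}
        for label in sequence:
            nxt = set()
            for s in current:
                nxt |= self.transitions.get((s, label), set())
            current = nxt
        return bool(current & self.accept)


# A query such as "walk, then wave", assembled from hypothetical primitives:
walk = Automaton.primitive("walk-step").then(Automaton.primitive("walk-step"))
wave = Automaton.primitive("raise-arm").then(Automaton.primitive("lower-arm"))
query = walk.then(wave)
print(query.accepts(["walk-step", "walk-step", "raise-arm", "lower-arm"]))  # True
print(query.accepts(["raise-arm", "walk-step"]))                            # False
```

Because the query is built entirely from previously learned pieces, no example of the composite activity is ever needed, matching the search-without-examples claim above.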
An alternative strategy for dealing with aspect and individual variation is to apply discriminative methods to appearance features. The difficulty here is that activities look different when seen from different directions. I will describe recent methods that make it possible to transfer models: that is, to learn a model of an activity in one view, then recognize it in a completely different view.
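The talk does not specify how the transfer works; as a rough sketch of the general idea (all data and the linear-map assumption are invented for illustration), one could fit a mapping between view-specific feature spaces from a few paired frames, so that a classifier trained in one view can score features from the other:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "appearance features" for two activities, learned in view A
# (here just nearest-centroid classification in a 3-D feature space).
centroids_a = {"walk": np.array([1.0, 0.0, 0.0]),
               "wave": np.array([0.0, 1.0, 0.0])}

# An unknown view change, modeled here as a fixed linear distortion.
true_view_change = np.array([[0.2, 1.1, 0.0],
                             [0.9, -0.3, 0.5],
                             [0.1, 0.4, 1.2]])

# Paired frames: the same instants observed in both views.
pairs_a = rng.normal(size=(50, 3))
pairs_b = pairs_a @ true_view_change.T

# Fit a map from view-B features back to view-A features by least squares.
W, *_ = np.linalg.lstsq(pairs_b, pairs_a, rcond=None)

def classify_in_view_b(feat_b):
    """Map a view-B feature into view A, then reuse the view-A classifier."""
    feat_a = feat_b @ W
    return min(centroids_a,
               key=lambda k: np.linalg.norm(feat_a - centroids_a[k]))

walk_seen_from_b = centroids_a["walk"] @ true_view_change.T
print(classify_in_view_b(walk_seen_from_b))  # walk
```

This is only a toy stand-in: real appearance features change non-linearly with viewpoint, which is exactly why the transfer problem the talk addresses is hard.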
Speaker: David Forsyth
Google Tech Talks September 8, 2008
David Forsyth holds a BSc and an MSc in Electrical Engineering from the University of the Witwatersrand, Johannesburg, and an MA and a D.Phil. from Oxford University. He is currently a full professor at the University of Illinois at Urbana-Champaign, having previously served ten years on the faculty at UC Berkeley. He has published over 100 papers on computer vision, computer graphics, and machine learning. He served as program co-chair for IEEE Computer Vision and Pattern Recognition in 2000, general co-chair for CVPR 2006, and program co-chair for ECCV 2008, and is a regular member of the program committees of all major international conferences on computer vision. He has received best paper awards at the International Conference on Computer Vision and at the European Conference on Computer Vision, as well as an IEEE Technical Achievement Award.
His recent textbook, "Computer Vision: A Modern Approach" (written with J. Ponce and published by Prentice Hall), is now widely adopted as a course text.