Videos tagged with Computer Vision
3D Computer Vision: Past, Present, and Future
Google Tech Talk, August 15, 2011. Presented by Steve Seitz.
Grouping Using Factor Graphs: an Approach for Finding Text with a Camera Phone
We introduce a new framework for feature grouping based on factor graphs, which are graphical models that encode interactions among arbitrary numbers of random variables. The ability of factor graphs to express interactions higher than pairwise order (the highest order encountered in most graphical models used in computer vision) is useful for modeling a variety of pattern recognition problems....
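To illustrate what an interaction "higher than pairwise order" looks like, here is a minimal sketch of a factor graph over three binary variables with one three-variable factor. The variable names, the factor values, and the brute-force search are all assumptions made for illustration; they are not taken from the paper, whose inference method is not described in this excerpt.

```python
from itertools import product

# Hypothetical setup: x1, x2, x3 are binary labels for three image features
# (1 = belongs to the group, 0 = does not). All numbers are illustrative.

def unary(x):
    # Mild preference for label 1 on a single variable.
    return 2.0 if x == 1 else 1.0

def triple(a, b, c):
    # Third-order factor: rewards all three variables agreeing.
    # A pairwise model cannot express this with a single factor.
    return 3.0 if a == b == c else 1.0

# Each factor is (names of the variables it touches, function).
factors = [
    (("x1",), unary),
    (("x2",), unary),
    (("x1", "x2", "x3"), triple),
]

def score(assignment):
    """Unnormalized probability: the product of all factor values."""
    result = 1.0
    for names, fn in factors:
        result *= fn(*(assignment[n] for n in names))
    return result

variables = ["x1", "x2", "x3"]
# Brute-force search over all 2^3 assignments (only viable for tiny graphs).
best = max(
    (dict(zip(variables, values)) for values in product([0, 1], repeat=3)),
    key=score,
)
print(best, score(best))
```

Exhaustive enumeration is used here only because the graph is tiny; real factor graphs need approximate or message-passing inference.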
Adaptive Feature Selection in Image Segmentation
Most practical image segmentation algorithms optimize some mathematical similarity criterion derived from several low-level image features. One possible way of combining different types of features, e.g., color and texture features at different scales and/or orientations, is to simply stack all the individual measurements into one high-dimensional feature vector. Due to the nature of ...
Visual Categorization with Bags of Keypoints
We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches. We propose and compare two alternative implementations using different classifiers: Naïve...
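The vector quantization step mentioned above can be sketched as follows: each patch descriptor is assigned to its nearest codebook entry, and the image is summarized by the resulting histogram of counts. The codebook, the 2-D descriptors, and the function names are hypothetical; real descriptors have many more dimensions, and how the codebook is built is not stated in this excerpt.

```python
def nearest_center(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))

def bag_of_keypoints(descriptors, codebook):
    """Histogram counting how many descriptors fall on each codebook entry."""
    histogram = [0] * len(codebook)
    for d in descriptors:
        histogram[nearest_center(d, codebook)] += 1
    return histogram

# Toy data (assumed): a 3-entry codebook and 4 patch descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.1, 0.9)]

histogram = bag_of_keypoints(descriptors, codebook)
print(histogram)  # [2, 1, 1]
```

The histogram, not the raw descriptors, is what would be passed to a classifier.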
IM2GPS: Estimating geographic information from a single image
Estimating geographic information from an image is an excellent, difficult high-level computer vision problem whose time has come. The emergence of vast amounts of geographically-calibrated image data is a great reason for computer vision to start looking globally on the scale of the entire planet! In this paper, we propose a simple algorithm for estimating a distribution over geographic locati...
Looking at People
There is a great need for programs that can describe what people are doing from video. This is difficult to do, because it is hard to identify and track people in video sequences, because we have no canonical vocabulary for describing what people are doing, and because phenomena such as aspect and individual variation greatly affect the appearance of what people are doing. Recent work in kinema...
Generative Models for Visual Objects and Object Recognition via Bayesian Inference
Lecture slides. Plato said… How many object categories are there? So what does object recognition involve? Verification: is that a bus? Detection: are there cars? Identification: is this a picture of Mao? Object categorization. Scene and context categorization. Challenges 1: viewpoint variation. Challenges 2: ...
Qualitative Spatial Relationships for Image Interpretation by using Semantic Graph
In this paper, a new way to express complex spatial relations is proposed in order to integrate them into a Constraint Satisfaction Problem with bilevel constraints. These constraints make it possible to build semantic graphs, which describe more precisely the spatial relations between the subparts of a composite object sought in an image. For example, it allows expressing complex spatial relation...
Facial expression recognition and emotion recognition from speech
The presentation tackles the problem of recognizing emotions from video and audio data. A fully automatic facial expression recognition system is based on three components: face detection, facial characteristic point extraction, and classification. Face detection is performed by boosting simple rectangular Haar-like features that give a decent representation of the face. These feat...
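As a rough illustration of a rectangular Haar-like feature, the sketch below computes a two-rectangle feature (left half minus right half) using an integral image, so that each rectangle sum costs four lookups. The integral image is the standard companion technique for such features and is assumed here; the excerpt does not say how the presented system computes them, and the boosting stage is not shown.

```python
def integral_image(image):
    """ii[y][x] holds the sum of image[0:y][0:x]; it has one extra row and column of zeros."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, using four integral-image lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of the window (width is assumed even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum

# Toy 3x4 patch (assumed): bright on the left, dark on the right.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
feature = two_rect_feature(ii, 0, 0, 3, 4)
print(feature)  # 48
```

A large magnitude indicates a strong vertical edge inside the window; a boosted detector would combine many such weak responses.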
Learning Sprites
A simple and efficient way to model much image and video data is to decompose it into a set of 2-dimensional objects in layers. Each object is characterized by its shape and appearance (as with a "sprite" in computer graphics). Following earlier work on layer decompositions in computer vision (e.g. Wang and Adelson, 1994), Frey and Jojic (1999) stated the sprite-learning problem in te...
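The layered view described above can be sketched as a compositing step: each sprite has a shape (a mask) and an appearance, and sprites are painted over a background from back to front. This is only the forward rendering model under simplifying assumptions (binary masks, no motion or transformation); it is not the learning procedure, and all names and values are hypothetical.

```python
def composite(background, sprites):
    """Paint sprites over the background, back to front.

    Each sprite is (mask, appearance); mask entries are 0 or 1.
    """
    frame = [row[:] for row in background]  # copy so the input is untouched
    for mask, appearance in sprites:
        for y, mask_row in enumerate(mask):
            for x, m in enumerate(mask_row):
                # Where the mask is 1 the sprite occludes whatever is below it.
                frame[y][x] = m * appearance[y][x] + (1 - m) * frame[y][x]
    return frame

# Toy 2x3 scene (assumed): two overlapping sprites on a black background.
background = [[0, 0, 0], [0, 0, 0]]
sprite_a = ([[1, 1, 0], [0, 0, 0]], [[5, 5, 5], [5, 5, 5]])
sprite_b = ([[0, 1, 1], [0, 0, 0]], [[7, 7, 7], [7, 7, 7]])

frame = composite(background, [sprite_a, sprite_b])
print(frame)  # [[5, 7, 7], [0, 0, 0]]
```

In the overlapping pixel, the sprite listed later wins, which is how layer ordering encodes occlusion.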