Topic Models Applied to Online News and Reviews

Posted in Companies, Conferences on June 24, 2012



Google Tech Talk
August 11, 2010

ABSTRACT

Presented by Alice Oh.

Probabilistic topic models are useful for uncovering the underlying semantic structure of a collection of documents. We take a simple and widely used topic model, Latent Dirichlet Allocation (LDA; Blei et al., 2003), and extend it in two ways. First, we construct topic chains to understand the dynamic semantic structure of an online news corpus. Second, we develop a unification model of sentiment and aspect to discover the detailed semantics of user-generated reviews.

For the topic chains research, we present a framework for comparing and clustering the topics discovered by LDA. We discuss how to interpret the resulting topic chains to understand general topic trends, temporary issues, and shifts of focus within a chain. We applied the topic chains framework to nine months of online news and present the results.
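As a rough illustration of what comparing and clustering LDA topics can look like, the sketch below links topics from consecutive time slices when their word distributions are close, then groups the linked topics into chains. The abstract does not say which similarity measure or threshold the talk uses, so the Jensen-Shannon divergence, the 0.3 cutoff, the toy distributions, and all function names here are assumptions made for the example.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def link_topics(slices, threshold=0.3):
    """Link similar topics in adjacent time slices.

    slices: list of time slices; each slice is a list of topic-word
            distributions over one shared vocabulary.
    Returns edges of the form ((t, i), (t + 1, j)).
    The threshold is an assumed value, not one taken from the talk.
    """
    edges = []
    for t in range(len(slices) - 1):
        for i, p in enumerate(slices[t]):
            for j, q in enumerate(slices[t + 1]):
                if js_divergence(p, q) <= threshold:
                    edges.append(((t, i), (t + 1, j)))
    return edges

def build_chains(slices, edges):
    """Group linked topics into chains (connected components) with union-find."""
    parent = {(t, i): (t, i) for t, s in enumerate(slices) for i in range(len(s))}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    chains = {}
    for node in parent:
        chains.setdefault(find(node), []).append(node)
    return sorted(sorted(c) for c in chains.values())

# Toy data: two time slices, two topics each, over a four-word vocabulary.
slices = [
    [[0.70, 0.20, 0.05, 0.05], [0.05, 0.05, 0.20, 0.70]],
    [[0.60, 0.30, 0.05, 0.05], [0.05, 0.05, 0.45, 0.45]],
]
edges = link_topics(slices)
chains = build_chains(slices, edges)
print(chains)
```

In this toy run each topic in the first slice is linked only to its counterpart in the second slice, giving two chains of length two. A chain that spans many slices would correspond to a long-running topic, while a topic that links to nothing would look like a temporary issue.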

For the Aspect and Sentiment Unification Model (ASUM) research, we base our model on the observation that users evaluate various aspects of a product in a review, such as the lens of a camera or the LCD display of a laptop, and that each sentence usually represents one aspect and a corresponding sentiment toward that aspect. We therefore propose sentence-LDA (SLDA), which adds the constraint that all words in a single sentence are generated from one aspect. We then extend SLDA to ASUM to jointly discover pairs of {aspect, sentiment}, which we call senti-aspects. We applied SLDA and ASUM to reviews of electronic devices and restaurants and present the results.
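The sentence-level constraint is easiest to see in the generative direction. The sketch below samples a toy review in the way SLDA is described above: one aspect is drawn per sentence, and every word in that sentence comes from that aspect's word distribution. The vocabulary, the two aspects, the Dirichlet prior values, and the function names are all made up for illustration. The sketch covers only SLDA, since the abstract gives no detail on how ASUM draws the sentiment, and it does not show inference.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw a probability vector from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_review(num_sentences, words_per_sentence, vocab, aspect_word_dists, alpha, rng):
    """Sample one review under the sentence-level constraint.

    Returns a list of (aspect_index, words) pairs, one per sentence.
    """
    theta = sample_dirichlet(alpha, rng)  # per-review mixture over aspects
    aspect_ids = list(range(len(aspect_word_dists)))
    review = []
    for _ in range(num_sentences):
        # One aspect for the whole sentence (plain LDA would draw one per word).
        z = rng.choices(aspect_ids, weights=theta)[0]
        words = rng.choices(vocab, weights=aspect_word_dists[z], k=words_per_sentence)
        review.append((z, words))
    return review

# Hypothetical vocabulary and aspects, chosen with disjoint word support
# so the one-aspect-per-sentence constraint is visible in the output.
vocab = ["lens", "zoom", "sharp", "screen", "lcd", "bright"]
aspect_word_dists = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # aspect 0: camera lens
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],  # aspect 1: display
]
rng = random.Random(0)
review = generate_review(5, 4, vocab, aspect_word_dists, [1.0, 1.0], rng)
for z, words in review:
    print(z, words)
```

Because the two toy aspects share no words, every printed sentence contains only lens words or only display words, never a mix.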

Alice Oh is an Assistant Professor of Computer Science at the Korea Advanced Institute of Science and Technology (KAIST). She leads her research group, the Users and Information Lab, with the vision of delivering information to satisfy the user. To that end, she studies and employs methods from machine learning, human-computer interaction, and statistical natural language processing. Alice completed her M.S. in Language and Information Technologies at CMU and her Ph.D. in Computer Science at MIT.

Tags: Google Tech Talk, machine learning, GoogleTechTalks