NIPS 2011 Big Learning - Algorithms, Systems, & Tools Workshop: Large-Scale Matrix...
Big Learning Workshop: Algorithms, Systems, and Tools for Learning at Scale at NIPS 2011
Invited Talk: Large-Scale Matrix Factorization with Distributed Stochastic Gradient Descent by Rainer Gemulla
Rainer Gemulla graduated from the Technische Universität Dresden in Germany in the area of database sampling. He is currently working as a senior researcher at the Max-Planck-Institut für Informatik in Saarbrücken, Germany.
Abstract: We provide a novel algorithm to approximately factor large matrices with millions of rows, millions of columns, and billions of nonzero elements. Our approach rests on stochastic gradient descent (SGD), an iterative stochastic optimization algorithm. Based on a novel "stratified" variant of SGD, we obtain a new matrix-factorization algorithm, called DSGD, that can be fully distributed and run on web-scale datasets using, e.g., MapReduce. DSGD can handle a wide variety of matrix factorizations and has good scalability properties.
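To make the idea of a stratified SGD more concrete, here is a minimal single-machine sketch in Python. It is not the speaker's implementation. It rests on several assumptions the abstract does not state: the matrix is cut into a d x d grid of blocks, a "stratum" is taken to be a set of d blocks that share no rows and no columns, and the loss is plain squared error on the observed entries. The names `block_of`, `sse`, and `stratified_sgd`, and all parameter defaults, are hypothetical. Blocks inside one stratum touch disjoint rows of W and disjoint rows of H, so in a distributed setting they could be handed to separate workers; here they are simply processed one after another.

```python
import random


def block_of(index, size, d):
    """Map a row or column index to one of d contiguous blocks."""
    return index * d // size


def sse(entries, W, H):
    """Sum of squared errors over the observed (i, j, value) entries."""
    return sum(
        (v - sum(w * h for w, h in zip(W[i], H[j]))) ** 2
        for i, j, v in entries
    )


def stratified_sgd(entries, n_rows, n_cols, rank=2, d=2,
                   epochs=300, lr=0.05, seed=0):
    """Approximate V by W * H^T using SGD over strata of disjoint blocks.

    Assumption: W has one factor row per matrix row, H one per matrix column.
    """
    rng = random.Random(seed)
    W = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    H = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_cols)]

    # Group the observed entries by (row block, column block).
    blocks = {}
    for i, j, v in entries:
        key = (block_of(i, n_rows, d), block_of(j, n_cols, d))
        blocks.setdefault(key, []).append((i, j, v))

    for _ in range(epochs):
        # Each shift s defines one stratum: blocks (b, (b + s) mod d).
        shifts = list(range(d))
        rng.shuffle(shifts)
        for s in shifts:
            # These d blocks share no rows or columns, so they are the
            # part that could run in parallel on separate workers.
            for b in range(d):
                cell = blocks.get((b, (b + s) % d), [])
                rng.shuffle(cell)
                for i, j, v in cell:
                    err = v - sum(w * h for w, h in zip(W[i], H[j]))
                    for k in range(rank):
                        w, h = W[i][k], H[j][k]
                        # Gradient step; the constant factor 2 of the
                        # squared-error gradient is folded into lr.
                        W[i][k] = w + lr * err * h
                        H[j][k] = h + lr * err * w
    return W, H


if __name__ == "__main__":
    # Toy rank-1 matrix with every cell observed.
    a = [1.0, 0.5, 1.5, 1.0]
    b = [1.0, 2.0, 0.5, 1.0]
    data = [(i, j, a[i] * b[j]) for i in range(4) for j in range(4)]
    W, H = stratified_sgd(data, n_rows=4, n_cols=4)
    print("squared error after training:", sse(data, W, H))
```

The design point this sketch tries to show is that the order of updates is restricted, not the updates themselves: each step is an ordinary SGD step, but steps are grouped so that concurrent groups never write to the same factor rows.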