Friday, April 3rd, 2015 / 2:30 PM, Doherty Hall 2210

"A New Platform for Cloud-based Distributed Machine Learning on Big Data"

ABSTRACT

In many modern applications, such as web-scale content extraction via topic models, genome-wide association mapping via sparse regression, and image understanding via deep neural networks, one needs to handle BIG machine learning (ML) problems that threaten to exceed the limits of current architectures and algorithms. While several new system frameworks beyond Hadoop, notably Spark and GraphLab, have emerged for parallelizing ML programs, good dialogue between systems and ML remains difficult. Most system designs are agnostic to the distinctive characteristics of ML programs, treating them literally as sets of operations, as in traditional programs, rather than as iterative convergent procedures for optimizing a function. They hence ignore important properties of such programs, such as error tolerance, non-uniform convergence, and structural coupling, which can fundamentally influence the priorities and goals of system design and open up new opportunities for improving efficiency.

In this talk, I will discuss these opportunities and present a new framework for distributed machine learning, Petuum, that leverages them. I will demonstrate how system innovations in light of ML-first principles lead to multiple orders of magnitude of scalability on a modest lab cluster for a wide range of large-scale problems in text modeling (topic models with 1M topics), social networks (mixed-membership inference on 100M nodes), personalized genome medicine (sparse regression on 100M dimensions), and computer vision (deep neural networks with billions of parameters), with provable guarantees on the correctness of distributed inference.

Bio

Eric Xing
CMU-LTI

Dr. Eric Xing is a Professor of Machine Learning in the School of Computer Science at Carnegie Mellon University, and the director of the CMU Center for Machine Learning and Health under the Pittsburgh Health Data Alliance. His principal research interests lie in the development of machine learning and statistical methodology, especially for solving problems involving automated learning, reasoning, and decision-making in high-dimensional, multimodal, and dynamic possible worlds in social and biological systems. Professor Xing received his Ph.D. in Computer Science from UC Berkeley. He is an associate editor of the Annals of Applied Statistics (AOAS), the Journal of the American Statistical Association (JASA), the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), and PLoS Computational Biology, and an action editor of the Machine Learning Journal (MLJ) and the Journal of Machine Learning Research (JMLR). He is a member of the DARPA Information Science and Technology (ISAT) Advisory Group, and a Program Chair of ICML 2014.

LTI colloquium: http://colloquium.lti.cs.cmu.edu
Speaker webpage: http://www.cs.cmu.edu/~epxing
Instructor: Alon Lavie
Administrator: Benjamin Cook