
Intel Collaborative Research Institute for Computational Intelligence
Distributed Deep Learning
Boris Ginzburg
May 5, 2015
Agenda
• Deep Learning - Phase-I
– Deep Learning @ Intel
– ICRI-CI academic impact
• Deep Learning - Phase-II:
– Academic impact
DEEP LEARNING – PHASE I
Deep Learning-I: @INTEL
2012: ICRI-CI was founded, with a focus on Architecture for Machine Learning
• 6/2012: Started research on HW accelerators for Neural Networks (B. Ginzburg, D. BD Rubin, D. Vainbrand)
• 12/2012: Proposed HW accelerator for Convolutional NN
2013: Early path-finding with the System LLT lab (R. Fishtein, S. Goffman, and R. Rappoport)
• 9/2013: Computer Vision Group founded
2014: Intel Labs started the Deep Learning project
2015: Multiple teams working on products and research related to DL and its applications
Deep Learning-I: Academic impact
ICRI-CI accelerated Deep Learning research in Israel
– DL boot camps at TAU & Technion
– 52 graduate students (B. Ginzburg)
– Sponsored a DL master class at TAU
– Over 300 attendees (organizers: L. Wolf, B. Ginzburg)
– Helped to establish DL labs at TAU and Technion
– DL in the curriculum at TAU, HUJI, and Technion
ICRI-CI built a partnership with Berkeley BVLC
– Built an IA-optimized Caffe version (openmp and fft branches)
– Work in progress on OpenCL and Xeon Phi versions
– Intel joined the BVLC board of sponsors
DEEP LEARNING – PHASE II
Large Scale Deep Learning
ImageNet-1K:
– 1000 classes, 1.2 million images
– 75% accuracy (top-1) and 95% (top-5)
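As a reminder of what these metrics mean: top-k accuracy counts a sample as correct when its true label is among the k highest-scoring classes. A minimal NumPy sketch (the array names and toy data are illustrative, not from the talk):

```python
import numpy as np

def topk_accuracy(scores: np.ndarray, labels: np.ndarray, k: int = 5) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    # sort class indices by descending score, keep the best k per sample
    topk = np.argsort(scores, axis=1)[:, ::-1][:, :k]
    return float((topk == labels[:, None]).any(axis=1).mean())

# toy check: 8 samples, 1000 classes (as in ImageNet-1K)
rng = np.random.default_rng(0)
scores = rng.standard_normal((8, 1000))
labels = rng.integers(0, 1000, size=8)
print(topk_accuracy(scores, labels, k=1), topk_accuracy(scores, labels, k=5))
```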
Very Large Scale Deep Learning
ImageNet-100K:
– 5/2015: 33,000 classes (“synsets”) with ~500 images/class (roughly 16.5 million images), with a target of 100,000 classes
– Google, Microsoft, and Baidu have built internal distributed systems
Deep Learning Phase-II: Goals
• Scale deep net architectures and algorithms to 100,000 classes
• Define a reliable distributed system for super-fast training of large deep nets
Very Large Scale Deep Learning
Major Challenges:
1. Theoretical: Scale deep net architectures and algorithms to 100,000 classes
– Very large and very deep NNs (Google, Microsoft - billions of parameters)
– Large ensemble of small specialized networks (see the first sketch below)
2. Engineering: Build a reliable distributed system for super-fast training of large deep nets
– DistBelief (Google), Adam (Microsoft); see the second sketch below
Our goal: Provide technology to leapfrog DL performance – especially in data centers
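A minimal sketch of the "ensemble of small specialized networks" idea, assuming each specialist covers a disjoint block of the 100,000-class label space and the ensemble scatters its softmax scores back into the full space; `Specialist`, `ensemble_predict`, and the random scores standing in for a trained forward pass are all hypothetical:

```python
import numpy as np

NUM_CLASSES = 100_000          # target label-space size from the slide
rng = np.random.default_rng(0)

class Specialist:
    """Stand-in for a small net trained on a subset of the classes."""
    def __init__(self, class_subset):
        self.class_subset = np.asarray(class_subset)  # global class ids covered

    def predict(self, x):
        # placeholder for a real forward pass: random logits over the subset
        logits = rng.standard_normal(len(self.class_subset))
        e = np.exp(logits - logits.max())
        return e / e.sum()                            # softmax over the subset

def ensemble_predict(specialists, x):
    """Scatter each specialist's scores into the full label space and average."""
    scores = np.zeros(NUM_CLASSES)
    for s in specialists:
        scores[s.class_subset] += s.predict(x)
    return scores / len(specialists)

# 10 specialists, each covering a disjoint block of 10,000 classes
specialists = [Specialist(range(i * 10_000, (i + 1) * 10_000)) for i in range(10)]
print(ensemble_predict(specialists, x=np.zeros(3)).argmax())
```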
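And a minimal synchronous sketch of the data-parallel, parameter-server pattern that systems such as DistBelief and Adam are built around (the real systems are asynchronous, fault-tolerant, and far more elaborate); the toy linear model and every name below are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1024, 10))            # toy dataset shared by 4 workers
w_true = rng.standard_normal(10)
y = X @ w_true + 0.1 * rng.standard_normal(1024)

def worker_gradient(w, X_shard, y_shard):
    """One worker's gradient of mean squared error on its data shard."""
    err = X_shard @ w - y_shard
    return 2.0 * X_shard.T @ err / len(y_shard)

def train(num_workers=4, steps=200, lr=0.05):
    w = np.zeros(10)                           # parameters held by the "server"
    shards = np.array_split(np.arange(len(y)), num_workers)
    for _ in range(steps):
        # each worker computes a gradient on its shard (sequential here;
        # in a real system these run in parallel on different machines)
        grads = [worker_gradient(w, X[idx], y[idx]) for idx in shards]
        w -= lr * np.mean(grads, axis=0)       # server averages and applies
    return w

print(np.allclose(train(), w_true, atol=0.1))  # recovers the toy model
```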
Distributed Deep Learning: Phase-II
[Diagram: the five pillars of the Phase-II program]
– New Deep Learning Applications: CV, video analytics, medical imaging, …
– Open Source Distributed Deep Learning Library optimized for Intel Architecture (Intel Labs)
– Novel Deep Learning Architecture and Theory
– Parallel and Distributed Optimization Algorithms for Deep Learning
– Scalable and Robust Distributed Systems for Large Scale Deep Learning
Phase II: Theory & Foundations

Author          Univ      Title
Amnon Shashua   HUJI      SimNets: A Generalization of Convolutional Networks
Naftali Tishby  HUJI      Optimal Deep Learning and the Information Bottleneck principle
Amir Globerson  HUJI      Improper deep learning with kernels
Boaz Nadler     Weizmann  Unsupervised and Semi-supervised Ensemble Learning
Shie Mannor     Technion  Outlier robust distributed learning
Phase II: Distributed Algorithms

Author                      Univ           Title
Ohad Shamir, Nathan Srebro  Weizmann, TTI  Distributed Methods for Non-Convex and Deep Learning
Shai Shalev-Shwartz         HUJI           New algorithms for distributed deep learning
Koby Crammer                Technion       Mega-classification for Deep Learning
Phase II: Systems

Author            Univ      Title
Mark Silberstein  Technion  Distributed deep learning on Xeon Phi
Phase II: Applications

Author              Univ      Title
Lior Wolf           TAU       Scene understanding: from image to text and from image and a question to an answer
Hayit Greenspan     TAU       Applications of Deep Learning to Medical Imaging
Michael Zibulevsky  Technion  Image restoration using deep learning
People

HUJI:     N. Tishby, A. Shashua, S. Shalev-Shwartz, A. Globerson*
Technion: K. Crammer, S. Mannor, M. Zibulevsky, M. Silberstein
Intel:    B. Ginzburg, L. Shani
TAU:      L. Wolf, H. Greenspan
Weizmann: B. Nadler, O. Shamir
Today's Talks

Deep Learning Session (Chair: Boris Ginzburg)
Boris Ginzburg       Distributed Deep Learning Library - Capstone overview
Naftali Tishby       Optimal Deep Learning and the Information Bottleneck Method
Michael Zibulevsky   Compressed sensing and computed tomography with deep learning
Lior Wolf            Automatic Image Annotation using Deep Learning and Fisher Vectors
Break
Amnon Shashua        Deep Layered SimNets
Shai Shalev-Shwartz  Rigorous algorithms for distributed deep learning
Shie Mannor          Outlier robust distributed learning