
Practical course “Bayesian Optimization”
(also known as “Parameter Tuning and Algorithm Configuration”)
Lectures: Frank Hutter
Exercises & Supervision of Projects:
Katharina Eggensperger, Stefan Falkner,
Matthias Feurer, Aaron Klein, Marius Lindauer
Overview of Today
• Practical information
• Learning goals
• Bayesian optimization in a nutshell
• Hands-on sessions
– First steps (Katharina)
– Setting up virtual box (Stefan)
– Plotting (Aaron)
Practical Information
• Room: Building 074, MST Pool
• Time: Mondays, 14:15-15:45
• Weeks 1-6: Basics (implement everything from scratch)
– Reading, a few lectures, and implementing what you learned
– Exercise sheets in weeks 1-4, each with a 2-week deadline.
Exercises count 50% of the grade.
• Weeks 7-14: Project (implement research from a paper)
– Week 7: overview of papers
– Week 9: Short student presentations on one paper each
– Week 14: Hand in project code & short report; student presentations.
– Project counts 50% of the grade (equal parts for paper presentation, project code, project presentation, project report)
Learning Goals
• After this course, you can …
– Derive & implement Gaussian process regression from scratch (a minimal sketch follows after this list)
– Derive & implement Bayesian optimization from scratch
– Implement a new facet of Bayesian optimization
from a research paper
– Effectively use Python to program mathematical software,
carry out experiments with it and plot their results
– Effectively use state-of-the-art
hyperparameter optimization methods
• You will also practice various soft skills:
– Short presentations, teamwork, report writing, ...
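To preview the first learning goal, here is a minimal NumPy sketch of Gaussian process regression with an RBF kernel (essentially Algorithm 2.1 from Rasmussen & Williams); the fixed kernel hyperparameters, noise level, and toy data are illustrative assumptions, not part of the course material.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential (RBF) kernel matrix between the rows of A and B."""
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * sq_dists / lengthscale**2)

def gp_posterior(X_train, y_train, X_test, noise_var=1e-4):
    """GP posterior mean and variance at X_test, assuming a zero prior mean."""
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    # Cholesky factorization gives numerically stable solves with K
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(X_test, X_test)) - np.sum(v**2, axis=0)
    return mean, var

# Toy usage: fit noisy sine observations and predict on a grid
X = np.linspace(0, 5, 8)[:, None]
y = np.sin(X).ravel() + 0.05 * np.random.randn(8)
mu, var = gp_posterior(X, y, np.linspace(0, 5, 50)[:, None])
```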
Bayesian Optimization in a Nutshell
• Prominent approach to optimize expensive blackbox functions [Mockus et al., '78]:
  max_{x ∈ X} f(x)
[Figure: illustration of Bayesian optimization on a blackbox function; axes x and f(x)]
• Efficient in the number of function evaluations
• Works when the objective is nonconvex, noisy, has unknown derivatives, etc.
• Recent convergence results [Srinivas et al., '10; Bull, '11; de Freitas, Smola, Zoghi, '12]
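To make this loop concrete, here is a minimal sketch of Bayesian optimization with expected improvement as the acquisition function; it reuses the gp_posterior helper from the sketch under "Learning Goals", fixes all GP hyperparameters, and optimizes the acquisition on a simple grid, so it illustrates the idea rather than giving a robust implementation.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_y, xi=0.01):
    """Expected improvement (for maximization) over the incumbent value best_y."""
    sigma = np.maximum(sigma, 1e-9)  # avoid division by zero
    z = (mu - best_y - xi) / sigma
    return (mu - best_y - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def bayesian_optimization(f, bounds, n_init=3, n_iter=20, rng=None):
    """Maximize a blackbox f on the interval bounds = (low, high); 1-D for brevity."""
    if rng is None:
        rng = np.random.default_rng(0)
    X = rng.uniform(*bounds, size=(n_init, 1))
    y = np.array([f(x[0]) for x in X])
    candidates = np.linspace(*bounds, 1000)[:, None]
    for _ in range(n_iter):
        # Fit the GP surrogate (gp_posterior: the helper sketched earlier)
        mu, var = gp_posterior(X, y, candidates)
        ei = expected_improvement(mu, np.sqrt(np.maximum(var, 0.0)), y.max())
        x_next = candidates[np.argmax(ei)]  # pick the most promising candidate
        X = np.vstack([X, x_next[None, :]])
        y = np.append(y, f(x_next[0]))      # one expensive evaluation per iteration
    return X[np.argmax(y)], y.max()
```

For example, bayesian_optimization(lambda x: -(x - 2.0) ** 2, bounds=(0.0, 5.0)) should quickly concentrate its evaluations near the maximizer x = 2.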
Why is Bayesian Optimization Interesting?
• Currently the leading approach for hyperparameter optimization in machine learning
• Can be used for automatic machine learning
– Feature Selection
– Selection of Machine Learning Algorithm
– Hyperparameter Optimization
→ Effective machine learning off-the-shelf
• Underlying approach for general algorithm configuration
– Adjust algorithm parameters to gain speed, accuracy, etc.
– If you have not encountered this problem, trust me: you will :-)
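For a sense of what "off-the-shelf" can look like in practice, here is a sketch using scikit-optimize, one of several BO libraries (not mentioned in these slides; only an illustrative choice); the two log-scaled hyperparameters mimic an SVM's C and gamma, and the objective is a synthetic stand-in for a cross-validated loss.

```python
from skopt import gp_minimize
from skopt.space import Real

def objective(params):
    """Stand-in for the cross-validated error of a model with these hyperparameters."""
    C, gamma = params
    return (C - 1.0) ** 2 + (gamma - 0.1) ** 2

result = gp_minimize(
    objective,
    dimensions=[Real(1e-3, 1e3, prior="log-uniform"),   # e.g. an SVM's C
                Real(1e-4, 1e1, prior="log-uniform")],  # e.g. an SVM's gamma
    n_calls=30,
    random_state=0,
)
print("best hyperparameters:", result.x, "best loss:", result.fun)
```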
Coming up next …
• The first exercise sheet is out today (due in 2 weeks)
• Next Monday: lecture on Gaussian processes
• I will follow the excellent book “Gaussian Processes for Machine Learning” by Carl Edward Rasmussen and Christopher K. I. Williams.
• Before the lecture, read sections 2.0-2.3 (15 pages) in the book. A free copy is available online: http://www.gaussianprocess.org/gpml/chapters/RW2.pdf
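As orientation for the reading: the central result of those sections is the GP predictive distribution at a test input, which in the book's notation (kernel matrix K = k(X, X), test kernel vector k_* = k(X, x_*), noise variance σ_n²) reads:

```latex
\bar{f}_* = \mathbf{k}_*^\top \left( K + \sigma_n^2 I \right)^{-1} \mathbf{y},
\qquad
\mathbb{V}[f_*] = k(\mathbf{x}_*, \mathbf{x}_*)
  - \mathbf{k}_*^\top \left( K + \sigma_n^2 I \right)^{-1} \mathbf{k}_*
```

The NumPy sketch under "Learning Goals" computes exactly these two quantities via a Cholesky factorization.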