Announcements
§  Final prep page up
§  Exam logistics:
    §  5/13, 11:30-2:30pm
    §  RSF Fieldhouse
§  Project 6
    §  Optional (drop two lowest)
    §  Due 5/8 at 5pm
§  Office hours (30+ hours total)
§  Past exams
§  Practice Final (optional)
    §  1pt EC on final
    §  Due 5/9

Post-final Schedule
§  Wed 5/13: Final
§  Thu 5/14: Grading
§  Fri 5/15:
    §  Graded Final Exam available on Gradescope
    §  Tentatively finalized grade report available on Gradescope
§  Sat 5/16 11:59pm: Deadline for regrade requests + reporting grade report inaccuracies
§  Sun 5/17: Regrades + grade report issues handled
§  Mon 5/18 11:59pm: Deadline to let us know if you didn’t hear back from us about regrade request / grade report issue
§  Tue 5/19: Grades submitted

CS 188: Artificial Intelligence
Conclusion
Instructor: Pieter Abbeel, University of California, Berkeley
[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

Contest Results

P1 Mini-Contest Results!
§  1st place: D’s Pacman – Shuheng Dai
§  2nd place: KnifeIntensive – Daylen Yang and Ravi Tadinada
§  3rd place: ballSoHard – Anthony Su and Peter Li

P2 Mini-Contest Results!
§  1st place: ᗧ • • • – Allan Zhao
§  2nd place: Doge – Kai Si and Denise Szeto
§  3rd place: 50 Shades of Pacman – Anthony Su and Peter Li

Final Contest Results!
§  Challenges: long-term strategy, multiple agents, adversarial utilities, uncertainty about other agents’ positions, plans, etc.

Agent Powers
§  Speed
§  Food Capacity
§  Laser
§  Respawn Time
§  Invisibility

Power Pellets
§  Juggernaut
§  Armor
§  Reveal
§  Grenade
§  Scare

Game Flow
§  Power Selection
§  Noisy Observations
§  Pellet Selection

Final Contest Statistics
§  30 teams, thousands of matches!
§  Naming trends:
    §  Creative names: DavisFooteIsDrunk, ThePacHac, All your baseline are belong to us, Pac is Bac, Derpbot w/Abbeel DLC Pack ($188)
    §  Clear intent name: Stanford (ended in last place)
§  Great work by everyone!
§  Final results: now

Top-10
§  (unranked) [1600] FoodThieve+ – Staff
§  (unranked) [1596] FoodThieve – Staff
§  4 [1538] TheFlash – Xinyu Liu and Trevor Ta
§  5 [1530] Kamehameha – Alex Yang
§  (unranked) [1527] blazeLine – Staff
§  (unranked) [1523] Clueless About Powers – Staff
§  6 [1515] Pac is Bac – Sam Kumar and Paul Reed Bramsen
§  7 [1474] MySuperBot – Christopher Le and Liang Wang
§  8 [1466] threeAndD – Albert Lin and Andy Sun
§  9 [1461] Neapolitan – Destine Lee and Jonathan Ting
§  10 [1459] BasePac – Yi-Wen Liao and Chung-Yen Lin

Top Five Teams
§  Derpbot w/Abbeel DLC Pack ($188) – Team Members: Joshua Mak and Derrick Hu
§  Doge 3 – Team Members: Kai Si and Denise Szeto
§  Tautle – Team Members: Jonathan Tau and Favian Ho
§  blazeLine – Team Members: staff
§  Staffeinated – Team Members: staff

For (not) 3rd Place
§  Derpbot w/Abbeel DLC Pack ($188)
   Team Members: Joshua Mak and Derrick Hu
   Both agents have 2 points on speed. The offensive agent has 2 points on invisibility while the defensive agent has 2 points on laser.
   If there are more than four capsules on the board:
   - Derpbot will spawn two juggernaut capsules: one near the spawn point and one on the same row near the center column
   - The offensive agent will use the juggernaut capsules to break a row of walls from one end to the other
   - The offensive agent will then collect five food pellets at a time
   - The defensive agent will move to the cleared out row and fire a laser across the board
   If there are only four capsules on the board:
   - Both agents will attempt to retrieve food, collecting two food pellets at a time
   - If an opponent agent is currently collecting food pellets, the defensive agent will chase after the Pacman
VS
§  Doge 3
   Team Members: Kai Si and Denise Szeto
   Defensive agent protects our side by being slightly fast, slightly invisible, and carrying a short-range laser. Slightly invisible offensive agent is super fast and respawns fairly quickly, so it's okay if it dies very often. Didn't really worry about using the new power pellets.

For 1st place
§  Tautle
   Team Members: Jonathan Tau and Favian Ho
   §  One offensive and one defensive agent
   §  Both agents have the same powers: 2 speed, 1 laser, 1 invis
   §  Because speed was the most overpowered, with laser and invis being equally important
   §  Defensive agent sprints to the middle and focuses on defending
   §  Whenever an enemy Pacman comes to our side, our defensive agent will try to kill it by chasing it
   §  Offensive agent sprints to the other side, collects 4 pellets, and then comes back, while trying to avoid enemy ghosts
   §  Once our score is higher than the opponent's, the offensive agent becomes a defensive agent and plays defense
   §  However, if losing, the offensive agent will continue to try to bring pellets back from the other side, 4 at a time
   §  Split up the defensive duties of the bots so that one focuses more on the top side of the map while the other one tends to be on the bottom side
VS
§  Doge 3
   Team Members: Kai Si and Denise Szeto
   Our defensive agent protects our side by being slightly fast, slightly invisible, and carrying a short-range laser. Our slightly invisible offensive agent is super fast and respawns fairly quickly, so it's okay if it dies very often. We didn't really worry about using the new power pellets.

Students vs staff
§  Tautle
   Team Members: Jonathan Tau and Favian Ho
VS
§  Staffeinated
   Team Members: CS188 Staff
   §  2 speed, 1 laser, 1 invis
   §  2 capacity, 2 speed
   §  Secret CS188 sauce

Top-3
§  1 [1867] Tautle – Jonathan Tau and Favian Ho
§  (unranked) [1831] Staffeinated – Staff
§  2 [1772] Doge 3 – Kai Si and Denise Szeto
§  (unranked) [1743] blazeLine – Staff
§  3 [1733] Derpbot – Joshua Mak and Derrick Hu

…and Congratulations to All!
§  Amazing work by everyone
§  You should all be proud of what you’ve accomplished!
Ketrina Yim, CS188 Artist

Personal Robotics

PR1
[VIDEO: pr1-montage-short.wmv] [Wyrobek, Berger, van der Loos, Salisbury, 2008]

PR2 (autonomous)
[VIDEO: 5pile_200x.mp4] [Maitin-Shepard, Cusumano-Towner, Lei, Abbeel, 2010]
[VIDEO: sock_matching_IROS2011.mov] [Wang, Miller, Fritz, Darrell, Abbeel, 2011]

Object Detection in Computer Vision
§  State-of-the-art object detection until 2012:
    Input Image → Hand-engineered features (SIFT, HOG, DAISY, …) → Support Vector Machine (SVM) → “cat”, “dog”, “car”, …
§  Krizhevsky, Sutskever, Hinton 2012 (also: LeCun, Bengio, Ng, Darrell, …):
    Input Image → 8-layer neural network with 60 million parameters to learn → “cat”, “dog”, “car”, …
    §  60 million learned parameters (since then, billions of parameters)
    §  ~1.2 million training images

Performance
[FIGURE: performance graph with AlexNet marked; graph credit Matt Zeiler, Clarifai]

Speech Recognition
[FIGURE: graph credit Matt Zeiler, Clarifai]

What’s Under the Hood?
[FIGURE: features f1, f2, f3 feed layers of units, each computing a sum (Σ) followed by a threshold test (>0?)]

History
§  Is deep learning 3, 30, or 60 years old?
§  Rosenblatt’s Perceptron
§  (Olshausen, 1996)
§  2000s: Sparse, Probabilistic, and Energy models (Hinton, Bengio, LeCun, Ng)
[based on history by K. Cho]
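
A minimal sketch of the Σ and >0? units from the "What's Under the Hood?" diagram, assuming each unit computes a weighted sum of its inputs plus a bias and outputs 1 when that sum is positive. The function names, the weights and biases, the two-feature input, and the XOR task are all hand-picked assumptions for illustration; none of them come from the slides.

```python
from typing import List, Sequence


def threshold_unit(inputs: Sequence[float], weights: Sequence[float], bias: float) -> int:
    """One unit: weighted sum (the Σ box), then the >0? test."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def layer(inputs: Sequence[float],
          weight_rows: Sequence[Sequence[float]],
          biases: Sequence[float]) -> List[int]:
    """A layer is several threshold units that all read the same inputs."""
    return [threshold_unit(inputs, w, b) for w, b in zip(weight_rows, biases)]


def xor_network(f1: int, f2: int) -> int:
    """Two-layer network with hand-picked (assumed) weights that computes XOR."""
    # Hidden layer: the first unit acts as OR, the second as NAND.
    hidden = layer([f1, f2], [[1.0, 1.0], [-1.0, -1.0]], [-0.5, 1.5])
    # Output unit acts as AND over the two hidden units.
    return threshold_unit(hidden, [1.0, 1.0], -1.5)


if __name__ == "__main__":
    for f1 in (0, 1):
        for f2 in (0, 1):
            print(f1, f2, "->", xor_network(f1, f2))
```

A single threshold unit cannot represent XOR, which is why the sketch stacks two layers. In a learned network the weights would be fit from data, not chosen by hand.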

What’s Changed
§  Data
    §  1.2M training examples
    §  * 2048 (shifts)
    §  * 90 (PCA re-coloring)
    §  1.2M * 2k * 90 ~ 0.216 trillion
    §  Human eye: 1k frames/s → ~6.84yrs
§  Compute power
    §  Two NVIDIA GTX 580 GPUs
    §  5-6 days of training time
§  Nonlinearity
    §  Sigmoid → ReLU
§  Regularization
    §  Drop-out
    §  (Training data augmentation)
§  Exploration of model structure
§  Optimization know-how
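
The data figures on the "What's Changed" slide can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: it follows the slide in rounding 2048 shifts down to 2k, and it assumes a year of 365.25 days, which the slide does not state. All names are hypothetical.

```python
TRAINING_IMAGES = 1_200_000             # 1.2M training examples
SHIFTS = 2_000                          # the slide rounds 2048 shifts to 2k
RECOLORINGS = 90                        # PCA re-coloring variants
FRAMES_PER_SECOND = 1_000               # the slide's figure for the human eye
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: 365.25-day year


def augmented_examples(images: int, shifts: int, recolorings: int) -> int:
    """Effective number of training examples after augmentation."""
    return images * shifts * recolorings


def years_to_view(examples: int, frames_per_second: int) -> float:
    """Years needed to view every example once at the given frame rate."""
    return examples / frames_per_second / SECONDS_PER_YEAR


total = augmented_examples(TRAINING_IMAGES, SHIFTS, RECOLORINGS)
years = years_to_view(total, FRAMES_PER_SECOND)

if __name__ == "__main__":
    # Prints: 0.216 trillion examples, 6.84 years
    print(f"{total / 1e12:.3f} trillion examples, {years:.2f} years")
```

With the unrounded 2048 shifts the product is 1.2M * 2048 * 90 = 221,184,000,000, or about 0.221 trillion.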

Examples of RL in Robotics
§  [Kohl and Stone, ICRA 2004]
§  [Tedrake, Zhang, Seung 2005]
§  [Ng + al, ISER 2004]
§  Number of parameters learned: tens

Additional Challenges Compared to Supervised Learning
§  Much weaker supervision
§  Cost for being in a state, but the current state is a consequence of initial state, actions, and noise = temporal credit assignment problem
§  Distribution over observed states determined by robot’s own actions = exploration problem

How About Real Robotic Visuo-Motor Skills?
Example Tasks
Architecture (92,000 parameters)
[Levine*, Finn*, Darrell, A., 2015 TR at: tinyurl.com/visuomotor]
Policy Search
Learned Skills

Frontiers / Limitations
§  Architectures for shared learning / transfer learning
§  Multiple robots and sensors (including simulation)
§  Multiple tasks
§  Simulation – Real world
§  Leverage simultaneously learning: dynamics, Q, policy
§  Exploration beyond ε-greedy
§  Controllers that require memory / estimation
§  Hierarchical policies
Applications: Manipulation, Locomotion, Vision-based Flight, …

Pac-Man Beyond the Game!

Pacman: Beyond Simulation?
Students at Colorado University: http://pacman.elstonj.com
[VIDEO: Roomba Pacman.mp4]

Pacman: Beyond Simulation!

Bugman?
§  AI = Animal Intelligence?
§  Wim van Eck at Leiden University
§  Pacman controlled by a human
§  Ghosts controlled by crickets
§  Vibrations drive crickets toward or away from Pacman’s location
http://pong.hku.nl/~wim/bugman.htm

Bugman
[VIDEO: bugman_movie_1.mov]

Where to Go Next?
§  Congratulations, you’ve seen the basics of modern AI
§  … and done some amazing work putting it to use!
§  How to continue:
    §  Machine learning: cs189, stat154
    §  Intro to Data Science: CS194-16 (Franklin)
    §  Probability: ee126, stat134
    §  Optimization: ee127
    §  Cognitive modeling: cog sci 131
    §  Machine learning theory: cs281a/b
    §  Vision: cs280
    §  Robotics: cs287
    §  NLP: cs288
    §  … and more; ask if you’re interested

That’s It!
§  Help us out with some course evaluations
§  Have a great summer, and always maximize your expected utilities!