UNDERSTANDING THE WORLD, BY LEARNING HOW TO MODEL IT - DEEP LEARNING @ HDM 2018 - CCC STUTTGART

UNDERSTANDING THE WORLD, BY LEARNING HOW TO MODEL IT
                                       Deep Learning @ HdM 2018

                                                   www.hdm-stuttgart.de

About me

› Johannes Theodoridis

› Audiovisuelle Medien @ HdM
› Computer Science and Media @ HdM
› Exchange @ KTH Stockholm

› Currently working with Johannes Maucher on AI and ML @ HdM
› Email: theodoridis@hdm-stuttgart.de

(Image credits: deepart.io; image first slide: https://i.redd.it/2ag4n25oq02y.jpg)
Deep Learning @ HdM Stuttgart | Johannes Theodoridis | 09.08.2018                                                             2

What do you do?

                                                                    "Irgendwas mit Medien" (German: "something with media")

What today is not about

                                                                    But don’t be fooled!
                                                                    Details matter in Deep Learning.

2017 in AI: Poker

› Brains Vs. AI - January 2017 @ Rivers Casino Pittsburgh
› AI wins 20-day Heads-up, No-Limit Texas Hold'em
  tournament against 4 top-class human poker players.

› ~ 10^161 different decision points in Texas Hold'em.
› Infeasible to pre-compute a strategy for each of the
  moves.

      Libratus: The Superhuman AI for No-Limit Poker
      [Brown, Sandholm – IJCAI 2017]

      "I didn't realize how good it was until today. I felt like I
      was playing against someone who was cheating, like it
      could see my cards. I'm not accusing it of cheating. It
      was just that good." – Dong Kim
      (Source: https://www.wired.com/2017/01/ai-conquer-poker-not-without-human-help/)

      Name                Rank    Results (in chips)
      Dong Kim            1       -$85,649
      Daniel MacAulay     2       -$277,657
      Jimmy Chou          3       -$522,857
      Jason Les           4       -$880,087
      Total:                      -$1,766,250

2017 in AI: Board Games

› 2016 AlphaGo
                 Mastering the game of Go with deep
                 neural networks and tree search
                 [Silver et al. – Nature 2016]

› learned from expert games + self-play
› defeats Lee Sedol (world champion) 4:1

› 2017 AlphaGo Zero
                 Mastering the game of Go without
                 human knowledge
                 [Silver et al. – Nature 2017]

› learned entirely on its own
› defeats AlphaGo 100:0
                                                                    (Credit: Photo courtesy of Google)

2017 in AI: Video Games

› 2015
                              Human-level control through
                              deep reinforcement learning
                              [Mnih et al. – Nature 2015]

› 2017

                              OpenAI bot wins 1vs1 at Dota 2 against Dendi
                              in a best-of-three match.
                              https://blog.openai.com/dota-2/
                              https://blog.openai.com/more-on-dota-2/
                              https://openai.com/the-international/

2018 in AI: Video Games

› August 5, 2018

                                       OpenAI Five wins 2 out of 3
                                       games against a Semi-Pro Team
                                       https://blog.openai.com/openai-five/

› Long time horizons: ~ 20000 Moves (Chess ~ 40, Go ~ 150)
› Action Space: ~1000 valid actions each tick (Chess ~35, Go ~250)
› Observation Space: 20,000 numbers representing all game information (Chess 70, Go 400)

› Learned via self-play: "OpenAI Five plays 180 years worth of games against itself every day."
› Hardware: Training is running on 256 GPUs and 128,000 CPU cores.

(Images: blog.openai.com/)

2017 in AI: Healthcare

› Dermatologist-level classification of skin
  cancer with deep neural networks
      [Esteva et al. – Nature 2017]

› Trained on 129,450 clinical images
› Performance on par when tested against
  21 board-certified dermatologists

2018 in AI: Healthcare

› April 11, 2018 - FDA Permits Marketing of
  First AI-based Medical Device: IDx-DR.

› Diagnostic system that autonomously
  analyzes images of the retina for signs of
  diabetic retinopathy.

› “Machines can help the doctor make a
  better diagnosis, but they are not good at
  making medical decisions afterward.”
      [EyeNet: Artificial Intelligence: The Next Step in Diagnostics - American
      Academy of Ophthalmology (AAO), Nov 2017]

                                                                                  Source: https://www.eyediagnosis.net

2017 in AI: Systems

› The Case for Learned Index Structures
      [Kraska et al. – arXiv:1712.01208]

› Replace a B-Tree index or hash index with
  a neural network
› Up to 70% faster than cache-optimized B-Trees
› While saving an order of magnitude in
  memory (over several real-world data sets)

› Authors argue that “replacing core
  components of a data management
  system through learned models has far
  reaching implications for future systems
  designs”
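
A minimal sketch of the learned-index idea, assuming a single least-squares line in place of the neural networks used in the paper. The model predicts where a key sits in a sorted array, and a short binary search inside the model's worst-case error window finishes the lookup. The class name `LearnedIndex` and the helper `fit_linear` are hypothetical and chosen for illustration only; the sketch assumes a non-empty key list.

```python
import bisect

def fit_linear(keys):
    """Least-squares line mapping key -> position in the sorted array."""
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = (n - 1) / 2
    var = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
    slope = cov / var if var else 0.0
    return slope, mean_p - slope * mean_k

class LearnedIndex:
    """Toy learned index: a model predicts the position, a bounded search corrects it."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.slope, self.intercept = fit_linear(self.keys)
        # The worst prediction error on the stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        pos = round(self.slope * key + self.intercept)
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None

index = LearnedIndex([3, 8, 15, 16, 23, 42, 108])
position = index.lookup(23)   # position of 23 in the sorted array
missing = index.lookup(4)     # None, because 4 is not stored
```

The better the model fits the key distribution, the smaller the error window and the shorter the final search. That is the intuition behind trading a tree traversal for a model evaluation.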

Wait what?

› "I have a terrible confession to make. AI systems today suck"
         Yann LeCun at Brown University 2017

› "All of these AI systems we see, none of them is 'real' AI"
         Josh Tenenbaum at CCN 2017

A rough distinction

› Strong AI (or Artificial General Intelligence, AGI) - can solve every task.
         This is what the media worries about (Singularity etc.), but we are not even close!

› Weak AI (or narrow AI) – can solve a specific task.
         This is everything you have seen so far. It works really well for some tasks, like image and speech recognition.

Why are we "not even close" to AGI?

› The brain learns with an efficiency that none of our machine learning methods can match.

› Our supervised learning systems require large numbers of examples.
› Our reinforcement learning systems require millions of trials.

› That is why we don't have robots that are as agile as a cat or a rat.
› That is why we don't have dialog systems that have common sense.

› What is missing?
› Learning paradigms that build (predictive) models of the world through observation and action.

Slide copied from: Dr. Yann LeCun, "How Could Machines Learn as Efficiently as Animals and Humans?"
https://www.youtube.com/watch?v=uYwH4TSdVYs

What is Machine Learning?

› Machine Learning is the subfield of artificial intelligence
  concerned with programs that learn from experience.
                                                [Russell and Norvig - Artificial intelligence: a modern approach]

What is Machine Learning?

› Task: Tell if there is an apple in the image

    Approach 1: write code

        def contains_apple(image):
            # Hand-written rule (pseudocode): count the red pixels
            # and compare against a fixed threshold.
            red_pixels = count(image.RED)
            if red_pixels > 300:
                return True
            else:
                return False

        YES / NO
        Does not scale

    Approach 2: learn from data

        Machine Learning

        YES / NO
        Does scale: With enough compute power and training samples
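
To make the contrast concrete, here is a small sketch of "learn from data" for the apple task, assuming the only feature is the red-pixel count. Instead of hard-coding 300, the threshold is chosen from labeled examples. The training data and the helper `learn_threshold` are hypothetical, and real systems learn far richer features than a single count.

```python
def learn_threshold(red_counts, labels):
    """Pick the red-pixel threshold that misclassifies the fewest training examples."""
    best_threshold, best_errors = None, len(labels) + 1
    for t in sorted(set(red_counts)):
        # Rule under test: predict "apple" when the count is at least t.
        errors = sum((c >= t) != y for c, y in zip(red_counts, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

# Hypothetical training data: red-pixel counts with human-provided labels.
red_counts = [20, 45, 80, 410, 520, 390, 60, 450]
labels = [False, False, False, True, True, True, False, True]

threshold = learn_threshold(red_counts, labels)

def contains_apple(red_pixels):
    """Same decision as Approach 1, but the threshold came from data."""
    return red_pixels >= threshold
```

With new labeled examples the threshold can simply be re-learned, whereas the hand-written rule has to be edited by a programmer.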

What is Deep Learning?
› Traditional Pattern Recognition: Fixed/Handcrafted Feature Extractor

      Feature Extractor → Trainable Classifier

› Deep Learning: Representations are hierarchical and trained

      Low-Level Features → Mid-Level Features → High-Level Features → Trainable Classifier

      Understanding Neural Networks Through Deep Visualization
      [Yosinski et al. – ICML 2015]
Slide Credit: Yann LeCun

How do we train these things?

    Training Data – Labeled by category (Label: Fruits, Label: Vehicles)

      Select a random mini-batch of data
      → Predict labels P
      → Calculate the error by comparing predicted (P) and true (T) labels
      → Update the pipeline towards less error
      → repeat

›     Because of the labels we call this SUPERVISED LEARNING.
›     These labels need to be generated somehow (by humans mostly).
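
The training loop above can be sketched in a few lines, assuming a tiny logistic-regression "pipeline" in place of a deep network. The toy features, the label encoding (0 = Fruits, 1 = Vehicles) and the hyperparameters are hypothetical and only there to make the four steps visible.

```python
import math
import random

def predict(weights, bias, x):
    """Probability that sample x belongs to class 1 (logistic regression)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(data, labels, steps=500, batch_size=4, lr=0.5, seed=0):
    rng = random.Random(seed)
    weights, bias = [0.0] * len(data[0]), 0.0
    for _ in range(steps):
        # 1. Select a random mini-batch of data.
        batch = rng.sample(range(len(data)), batch_size)
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for i in batch:
            p = predict(weights, bias, data[i])   # 2. Predict label P.
            error = p - labels[i]                 # 3. Compare P with the true label T.
            for j, xj in enumerate(data[i]):
                grad_w[j] += error * xj / batch_size
            grad_b += error / batch_size
        # 4. Update the parameters towards less error.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

# Hypothetical toy features; label 0 = "Fruits", label 1 = "Vehicles".
data = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
        [0.8, 0.9], [0.9, 0.7], [0.7, 0.8], [0.9, 0.9]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train(data, labels)
```

In a deep network the same loop applies, except that the gradient is computed through all layers by backpropagation.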

What is in the boxes?
      Input: Current game screen → Convolutional Neural Network (CNN) → Output: Best action to choose
      (note: no pooling layers in this architecture)

› CNN architecture that was used by [Mnih et al. – Nature 2015]
  to play Atari Games (Deep Q-Networks - DQN)
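
A short sketch of how the screen shrinks on its way through such a network, and how the output is turned into an action. The layer settings (4 stacked 84x84 frames, three convolutions without padding) are quoted from memory of the DQN paper and should be treated as an assumption; the function names are hypothetical.

```python
def conv_output_size(size, kernel, stride):
    """Spatial size after a convolution without padding."""
    return (size - kernel) // stride + 1

# Assumed DQN layer settings: (filters, kernel size, stride).
CONV_LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]

def dqn_feature_count(input_size=84, input_channels=4):
    """Number of values handed from the last conv layer to the dense layers."""
    size, channels = input_size, input_channels
    for filters, kernel, stride in CONV_LAYERS:
        size = conv_output_size(size, kernel, stride)
        channels = filters
    return size * size * channels

def best_action(q_values):
    """Greedy choice: index of the action with the highest estimated value."""
    return max(range(len(q_values)), key=q_values.__getitem__)

features = dqn_feature_count()                 # 84 -> 20 -> 9 -> 7, times 64 filters
action = best_action([0.1, 2.5, -0.3, 1.0])    # hypothetical network outputs
```

Because the strides already reduce the resolution, the network gets by without pooling layers, which matches the note on the slide.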

A bit of CNN history: Thank you cats :)

›     RECEPTIVE FIELDS, BINOCULAR INTERACTION AND FUNCTIONAL ARCHITECTURE IN THE CAT'S VISUAL CORTEX
      [Hubel & Wiesel 1962] (Photo by Bertil Videt CC BY-SA 3.0)
›     Neocognitron [Fukushima 1980]
›     LeNet-5 [LeCun, Bottou, Bengio, Haffner 1998]
›     Large Scale Visual Recognition Challenge (ILSVRC)
›     AlexNet [Krizhevsky, Sutskever, Hinton 2012]
›     Deep Learning

›     ½ Nobel Prize in Physiology or Medicine 1981: David H. Hubel and Torsten
      N. Wiesel "for their discoveries concerning information processing in the
      visual system".

What does “deep“ mean?
              VGG
              [Simonyan, Zisserman 2014]
              Input → conv 64 → conv 64 → maxpool → conv 128 → conv 128 → maxpool → conv 256 → conv 256 → maxpool →
              conv 512 (×4) → maxpool → conv 512 (×4) → maxpool → FC 4096 → FC 4096 → FC 1000 → softmax

              GoogLeNet
              [Szegedy et al. 2014]

              ResNet
              [He et al. 2015]

              DenseNet
              [Huang et al. 2017]

              Slide Credit: Yann LeCun

Supervised Learning
› Image Classification, Image Retrieval

      ImageNet Classification with Deep Convolutional Neural Networks
      [Krizhevsky, Sutskever, Hinton 2012]

› Machine Translation

      English: "They agree" → German: "Sie stimmen zu"

      Convolutional Sequence to Sequence Learning
      [Gehring et al. 2017]

Supervised Learning
› Image Caption Generation

                                                                    Show, Attend and Tell: Neural
                                                                    Image Caption Generation with
                                                                    Visual Attention
                                                                    [Xu et al. 2015]

Supervised Learning
› Instance Segmentation

                                                                    Mask R-CNN
                                                                    [He et al. 2017]

Supervised Learning
› Instance Segmentation in traffic

                                                                                                                                      Mask R-CNN
                                                                                                                                      [He et al. 2017]
                                                                    (Source: 4K Mask RCNN COCO Object detection and segmentation #2
                                                                    https://www.youtube.com/watch?v=OOT3UIXZztE )

Supervised Learning
› Pose Estimation

                                                                    Mask R-CNN
                                                                    [He et al. 2017]

Reinforcement Learning
› Play SNES games (Bachelor Thesis @ HdM)
› Learn Locomotion Behaviours @ DeepMind

      Emergence of Locomotion Behaviours in Rich Environments
      [Heess et al. 2017] (Video: https://www.youtube.com/watch?v=hx_bgoTF7bs)

What are we missing?

                                                                    › Obstacles to AI
                                                                       › Learning models of the world
                                                                       › Learning to reason and plan

                                                                       Yann LeCun at CCN 2017
                                                                       (but he made this point in many talks)

Common Sense Knowledge

› Image Caption Fails.

› The teddy doesn't fit into the brown suitcase because it's too
  [small/large]. What is too [small/large]?
  Answers: the suitcase / the teddy. (Winograd Schemas)

› ”Tom picked up his bag and left the room”.

› These questions are easy for us because we have a model of the
  world.
(Sources: https://techcrunch.com/2016/11/08/shining-light-on-facebooks-ai-strategy/ ,
http://www.reactiongifs.com/wp-content/uploads/2013/02/nwld.gif , http://images.memes.com/meme/999039 )

Common Sense Knowledge
› Common Sense is the ability to fill in the blanks
› Filling in the visual field at the retinal blind spot.
› Filling in occluded images, missing segments in speech.

› Intuitive Physics + Intuitive Psychology
› track objects over time
› discount physically implausible trajectories
› distinguish animate agents from inanimate objects
› understand that other people have mental states like goals and beliefs

› Where can this come from? -> Unsupervised Learning
› Most of the learning performed by animals and humans is unsupervised. (no teacher)
› We learn how the world works by observing it.
› We learn that the world is 3-dimensional.
› We learn object permanence.

› We build a model of the world through predictive unsupervised learning.
(This predictive model gives us "common sense")
(Slide is a composition from: Yann LeCun, "How Could Machines Learn as Efficiently as Animals and Humans?" https://www.youtube.com/watch?v=uYwH4TSdVYs ,
Sources: Baby http://www.mommyshorts.com/wp-content/uploads/2014/09/6a0133f30ae399970b0192aa1b4c77970d-800wi.jpg , Retina by Jerry Crimson Mann CC-BY-SA 3.0)

Learning Predictive Forward Models of the world.
› Task: Predict in which direction the Mikado sticks will fall
      observation 1 → observation 2 → … → possible outcomes Y1, Y2
      (Slide Credit: Yann LeCun. Thx: Raphy for playing Mikado with me)

›     Problem: Invariant prediction: The training samples are merely representatives
      of a whole set of possible outputs (e.g. a manifold of outputs)
›     We need to represent a distribution. But how do you represent a distribution
      in high dimensional space?

›     Solution (one): Energy-Based Unsupervised Learning
         ›    Idea: Take low value on data manifold, higher values everywhere else
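
A tiny numeric illustration of why a single point prediction is not enough here, assuming a stick that falls left (-1.0) or right (+1.0) equally often. The encoding and the data are hypothetical. The prediction that minimises mean squared error is the average of the outcomes, which is a direction the stick never actually falls in.

```python
# Hypothetical outcomes: the stick falls left (-1.0) or right (+1.0) equally often.
outcomes = [-1.0, 1.0] * 500

def mse(prediction, targets):
    """Mean squared error of one fixed prediction against all observed outcomes."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)

# The single prediction that minimises mean squared error is the sample mean.
mean_prediction = sum(outcomes) / len(outcomes)

# Compare the "average" answer with the two answers that can really happen.
errors = {p: mse(p, outcomes) for p in (-1.0, mean_prediction, 1.0)}
```

The mean wins on squared error although it is not a plausible outcome, which is why the model needs to represent the whole set of possible outputs rather than one averaged guess.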

Generative Adversarial Networks (GAN) [Goodfellow et al. 2014]

      "Noise" → Generator (Neural Network) → generated images
      Real world images + generated images → Discriminator (Neural Network) → Real / Fake

› The Generator network will try to generate fake images that fool the discriminator.
› The Discriminator network will try to distinguish between a real and a generated image.
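
The two-player game can be sketched on one-dimensional numbers instead of images. This is a toy under stated assumptions, not the setup from the paper: the generator is a line G(z) = a*z + b, the discriminator is a logistic unit D(x) = sigmoid(w*x + c), the "real" data are samples around 4.0, and the generator uses the common non-saturating loss. All names and hyperparameters are hypothetical, and toy GANs like this one may oscillate rather than settle.

```python
import math
import random

def sigmoid(s):
    s = max(-60.0, min(60.0, s))   # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-s))

def train_gan(steps=1000, batch=16, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0    # generator parameters: G(z) = a*z + b
    w, c = 0.1, 0.0    # discriminator parameters: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]   # stand-in for real images
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: push D(real) towards 1 and D(fake) towards 0.
        d_real = [sigmoid(w * x + c) for x in real]
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_w = (sum((d - 1) * x for d, x in zip(d_real, real))
                  + sum(d * x for d, x in zip(d_fake, fake))) / batch
        grad_c = (sum(d - 1 for d in d_real) + sum(d_fake)) / batch
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: push D(fake) towards 1, i.e. try to fool the discriminator.
        d_fake = [sigmoid(w * (a * z + b) + c) for z in noise]
        grad_a = sum((d - 1) * w * z for d, z in zip(d_fake, noise)) / batch
        grad_b = sum((d - 1) * w for d in d_fake) / batch
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b, w, c

a, b, w, c = train_gan()
```

Each network only ever sees the other's current behaviour, so neither has a fixed target. That moving target is a large part of why GAN training is known to be delicate.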

Welcome to the GAN Zoo
› Generate bedrooms - 2016

                                                                    Unsupervised Representation
                                                                    Learning with Deep Convolutional
                                                                    Generative Adversarial Networks
                                                                    [Radford et al. ICLR 2016]

GAN Zoo
› Generate bedrooms, buildings, cats - 2017

                                                                    StackGAN++: Realistic Image
                                                                    Synthesis with Stacked Generative
                                                                    Adversarial Networks
                                                                    [Zhang et al. 2017]

GAN Zoo
› Generate celebrities - 2018
                                                                    High resolution: 1024 x 1024 pixels

                                                                    Progressive Growing of GANs for
                                                                    Improved Quality, Stability, and
                                                                    Variation
                                                                    [Karras et al. 2018]

                                                                    IntroVAE: Introspective Variational
                                                                    Autoencoders for Photographic
                                                                    Image Synthesis
                                                                    [Huang et al. 2018]

GAN Zoo
› Face arithmetic

                                                                    StarGAN: Unified Generative
                                                                    Adversarial Networks for Multi-Domain
                                                                    Image-to-Image Translation
                                                                    [Choi et al. 2017]

                                                                                               Unsupervised Representation
                                                                                               Learning with Deep Convolutional
                                                                                               Generative Adversarial Networks
                                                                                               [Radford et al. ICLR 2016]

GAN Zoo
› Next Frame Prediction

                                                                                                                                                Deep multi-scale video prediction
                                                                                                                                                beyond mean square error
                                                                                                                                                [Mathieu et al. 2017]

                                                                                                                                                   Predicting Deeper into the Future
                                                                                                                                                   of Semantic Segmentation
                                                                                                                                                   [Luc and Neverova et al. 2017]

(Sources: https://cs.nyu.edu/~mathieu/iclr2016.html, https://github.com/facebookresearch/SegmPred)

GAN Zoo
› Image-to-Image translation

                                                                    Image-to-Image Translation
                                                                    with Conditional Adversarial
                                                                    Networks
                                                                    [Isola et al. 2017]
                                                                                                        Image-to-Image Demo
                                                                                                        https://affinelayer.com/pixsrv/

                                                                                            Unpaired Image-to-Image
                                                                                            Translation using Cycle-Consistent
                                                                                            Adversarial Networks
                                                                                            [Zhu and Park et al. 2017]


GAN Zoo
› Text-to-Image translation

                                                                    StackGAN++: Realistic Image
                                                                    Synthesis with Stacked Generative
                                                                    Adversarial Networks
                                                                    [Zhang et al. 2017]


GAN Zoo
› Image Colorization

                                                                    Colorful Image Colorization
                                                                    [Zhang, Isola, Efros 2016]

                                                                                                                  Style2Paints 2.1
                                                                                                                  https://github.com/lllyasviel/style2paints

                                                                    Scribbler: Controlling Deep Image Synthesis
                                                                    with Sketch and Color
                                                                    [Sangkloy et al. 2017]


GAN Zoo
› Interactive drawing

                                                                    Generative Visual Manipulation on the Natural
                                                                    Image Manifold
                                                                    [Zhu et al. 2016]


What's next? (my prediction)

› We will see a lot more real-world applications of
  Supervised Learning in many (new) domains.

› We will see more efficient Reinforcement Learning.
  (good for robotics)

› Research in Unsupervised Learning has “just started”.

› Key to “stronger” AI: Prediction + Planning = Reasoning.


› We have been doing AI and ML since 2006/2007 (Medieninformatik / Mobile Medien)

› Applied approach: How can we bring AI into production?
         › Lectures are split ~50/50 between theory and programming
         › Constantly growing number of students in AI lectures (the last ML course had 60+ students)
         › NEW: ML specialization within the Computer Science and Media Master program.

         › Many AI-related projects in gaming, apps, websites, and embedded systems
         › 10 to 15 degree theses per semester (in-house and with industry: Daimler, Bosch, Porsche, etc.)


AI @ HdM Stuttgart

› We go to Hackathons :)

› Visit us:
  www.hdm-stuttgart.de/~maucher

› or come to the HdM Media Night!
      (the next one is at the end of winter term 18/19, around the end of January)

› Thank you!

Daimler TSS Artificial Intelligence Garage – November 2017

