7.4 The ‘deep’ in deep learning

  • Deep learning = subfield of machine learning
    • New take on learning representations from data that puts an emphasis on learning successive layers of increasingly meaningful representations
  • “deep” stands for this idea of successive layers of representations
    • Depth of model = how many layers contribute to a model of the data
    • Modern deep learning often involves tens or even hundreds of successive layers of representations (all learned automatically from exposure to training data)
  • Layered representations are (almost always) learned via models called neural networks (structured in literal layers stacked on top of each other)
  • “Neural network”: Reference to neurobiology as central DL concepts inspired by inspiration from understanding brain (BUT no evidence that brain implements mechanisms used in DL!)
  • Deep learning is a mathematical framework for learning representations from data
  • What do the representations learned by a deep-learning algorithm look like? Let’s examine how a network several layers deep (see figure 1.5) transforms an image of a digit in order to recognize what digit it is.
Source: Chollet and Allaire (2018) , Chap. 1.1.4
  • Below: Network transforms the digit image into representations that are increasingly different from the original image and increasingly informative about the final result
  • Deep network = multistage information-distillation operation, where information goes through successive filters and comes out increasingly purified (that is, useful with regard to some task)
  • Deep learning (technically): a multistage way to learn data representations
    • Simple idea but very simple mechanisms, sufficiently scaled, can end up looking like magic
Source: Chollet and Allaire (2018) , Chap. 1.1.4