Deep Learning From Scratch Series: A Simple Neural Network [Part 1]

Written by jeff-ridgeway | Published 2020/05/30
Tech Story Tags: deep-learning | tutorial | jupyter-notebook | python | machine-learning | newbie | artificial-intelligence | numpy

TLDR: Deep Learning From Scratch Series: A Simple Neural Network [Part 1]. This series assumes you know some high school maths to get the most from the notes/takeaways sections, and will be more-so a companion alongside the book and its Jupyter notebooks. The book was picked after reading the trial version and being impressed with Andrew Trask's articulation of complex topics in easy-to-understand language. The posts will be over-arching in scope, with the more difficult topics in DL ironed out along the way.

Photo from Pinterest here -> this screenshot comes from a Martin episode that you can watch here and get a good laugh 😂
Note: As an Amazon Associate I earn from qualifying purchases. I get commissions for purchases made through links in this post. See a more full disclaimer here
Another Note: This series assumes you know some high school maths to get the most from the notes/takeaways sections.

Bruh, why are you starting from scratch? There's PyTorch, TensorFlow, etc…

Around Thanksgiving 2019, I was able to visit my grandparents and talk with my grandfather. That conversation was so life-giving in so many ways and will stay in my memory. One of the golden nuggets of wisdom he told me was:
"If you wish to get good at anything, learn the theory behind what you seek to learn/master."
This wisdom has stayed with me to this day, including in starting this Deep Learning (DL) journey. I am much more interested in learning the fundamentals and sharing what I've learned with others (in my language of course) than diving headfirst into the most popular frameworks. While I am not against frameworks (and will use them later), this is the route I have chosen for myself.
In this scenario, I'd rather learn what makes the car drive than just get in the car and drive 🤷🏾‍♂️. To continue with the metaphor, you can't learn about the car without the manual, so below is my chosen manual 👇🏾.

Resource used during this journey

Out of the multitude of books and online learning resources, I've picked this as my beginning guide into the DL world. This pick was based on reading the trial version and being impressed with Andrew Trask's articulation of complex topics in very easy-to-understand language. For someone starting on the theory side of DL, I view this as very important.
Therefore, the posts in this series will be more-so a companion alongside the book. It's recommended that you purchase the book, since the posts and Jupyter notebooks (more on those here, if you are unfamiliar) that follow will be derived from the book's content.
If you are looking to purchase the ebook version, which is relatively cheaper, go to Manning Publications, or if you like having physical copies of the book check out the Amazon link below 👇🏾: https://amzn.to/2WYov5u
Without further ado, let's get started 😃

Jupyter Notebook & Major Takeaways From Chapter 2 & 3

Seeing as the book is more in-depth, the takeaways in this series will be a summary of what I took from the chapters (and other thoughts), with a link to my Jupyter notebook at the end. My Jupyter notebooks go deeper into the concepts explained in the book with code and pictures/diagrams. Thus, these blog posts will be over-arching in scope (big-picture) and will iron out the more difficult topics in DL. So let's dive into some big takeaways from Chapters 2 and 3 🏊🏾‍♂️

Chapter 2 Summaries/Notes

Artificial Intelligence (AI) and DL are not synonymous! Rather, DL is a subset of Machine Learning (ML), which in turn is a subset of AI.
Supervised Machine Learning is just a fancy word for "taking what you know as input and quickly transforming it into what you want to know." [1]
Unsupervised Machine Learning is wanting to know how the pieces of your input data relate to each other, without having prior knowledge/labeling of what the data is exactly.

Chapter 3 Summaries/Notes

A Neural Network helps you make a prediction based on the input values given and their corresponding weights. For a simple example, your favorite memorized math formula from middle school and high school, y = mx + b, would be considered a "neural network" (a minimal code sketch follows the list below).
  1. y -> is the predicted value
  2. m -> would be the weight/"knob" used to alter the predicted value of the input. Andrew says, "Another way to think about a neural network's weight value is as a measure of sensitivity between the input of the network and its prediction. If the weight is very high, then even the tiniest input can create a really large prediction!" [2]
  3. x -> your input values
  4. b -> bias (chapter 3 doesn't discuss this at all, but I found out this is bias from Becoming Human)
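To make that list concrete, here is a minimal sketch (my own variable names, not the book's) of the single-input, single-weight case in plain Python:

# A single-input "neural network": prediction = input * weight + bias
def neural_network(x, weight, bias=0.0):
    # This is just y = mx + b, with the weight playing the role of m
    return x * weight + bias

x = 8.5                            # some input value
weight = 0.1                       # the weight/"knob"
print(neural_network(x, weight))   # 0.85

Turning the weight up or down changes how sensitive the prediction is to the input, which is exactly the "knob" idea above.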
When a network has multiple inputs, the weighted sum is the multiplication of each weight by its corresponding input, followed by the summation of those products. This weighted sum is the y when you have multiple inputs (y = w1x1 + w2x2 + w3x3); a short sketch of this is below.
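A rough sketch of that weighted sum using NumPy (the input and weight values here are made up for illustration); an elementwise multiply followed by a sum is just a dot product:

import numpy as np

# Weighted sum: multiply each input by its weight, then add the products
x = np.array([8.5, 0.65, 1.2])   # x1, x2, x3
w = np.array([0.1, 0.2, 0.0])    # w1, w2, w3

y = np.dot(w, x)                 # w1*x1 + w2*x2 + w3*x3
print(y)                         # 0.98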
Multiple inputs and multiple outputs (each output gets its own set of weights; the sketch after this list works through the same math):
  1. y1 = w11x1 + w12x2 + w13x3
  2. y2 = w21x1 + w22x2 + w23x3
  3. y3 = w31x1 + w32x2 + w33x3
y1, y2, y3 correspond to your new outputs respectively. As you can see above, the x values never change but the set of weights does.
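Here is a sketch of the multiple-inputs, multiple-outputs case (again with made-up numbers): stacking the three sets of weights as the rows of a matrix turns the three weighted sums into one matrix-vector product.

import numpy as np

x = np.array([8.5, 0.65, 1.2])   # x1, x2, x3, shared by every output

# One row of weights per output: row i holds wi1, wi2, wi3
W = np.array([
    [0.1, 0.1, -0.3],   # weights for y1
    [0.1, 0.2,  0.0],   # weights for y2
    [0.0, 1.3,  0.1],   # weights for y3
])

y = W.dot(x)             # [y1, y2, y3], one weighted sum per row
print(y)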
Stacked Neural Networks -> you're making predictions on your predictions, with new sets of weights 🤯 (following the previous bullet):
  1. newY1 = newW11y1 + newW12y2 + newW13y3
  2. newY2 = newW21y1 + newW22y2 + newW23y3
  3. newY3 = newW31y1 + newW32y2 + newW33y3
newY1, newY2, newY3 would be your final values in the neural network, while y1, y2, y3 would be considered your "hidden" values.
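A sketch of the stacked (two-layer) version, with placeholder weight matrices of my own choosing: the first layer produces the hidden values y1..y3, and the second layer of new weights makes predictions on those predictions.

import numpy as np

x = np.array([8.5, 0.65, 1.2])   # the original inputs

# Layer 1: inputs -> "hidden" values y1, y2, y3
W_input_to_hidden = np.array([
    [ 0.1,  0.2, -0.1],
    [-0.1,  0.1,  0.9],
    [ 0.1,  0.4,  0.1],
])

# Layer 2: hidden values -> final outputs newY1, newY2, newY3
W_hidden_to_output = np.array([
    [0.3, 1.1, -0.3],
    [0.1, 0.2,  0.0],
    [0.0, 1.3,  0.1],
])

hidden = W_input_to_hidden.dot(x)        # predictions on the inputs
final = W_hidden_to_output.dot(hidden)   # predictions on those predictions
print(hidden)   # the "hidden" values
print(final)    # the final outputs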

Jupyter Notebook

As stated in the beginning, the Jupyter notebook attached to each post in this series will go more in-depth with code, diagrams, and my explanation of what the book is covering. Check out the notebook below and leave any comments on Kaggle 👇🏾

Next Sunday will be an overview of Gradient Descent!

Until next time ✌🏾
References:
  • [1] "Fundamental Concepts: How Do Machines Learn?" Grokking Deep Learning, by Andrew W. Trask, Manning Publications, 2019, p. 95.
  • [2] "Introduction to Neural Prediction: Forward Propagation." Grokking Deep Learning, by Andrew W. Trask, Manning Publications, 2019, p. 126.

Written by jeff-ridgeway | Software Engineer/Blogger
Published by HackerNoon on 2020/05/30