Neural networks are a hot topic in the technology industry today, partly because they make a cameo appearance in so many everyday devices. From your phone’s camera to an Alexa speaker to even a toothbrush, companies and organizations are jumping on the AI hype train.
Whether some of these are appropriate uses for neural networks and AI is up for debate, but understanding how neural networks work will not only give you a bullet point on your resume but also help you recognize when to use them in real-world situations.
Of course, before you can build neural networks in any language or toolkit, you must first understand what they are.
What is this AI thing anyway?
Artificial Intelligence (AI), broadly speaking, refers to machines attempting to perform tasks that would otherwise require human intelligence. There are many branches within AI, such as robotics and data mining, but because it is easy to get lost in such a broad subject, this post focuses on one specific subset: machine learning. Neural networks are part of the machine learning discipline, so it helps to understand ML first.
According to Wikipedia:
Machine learning (ML) is the study of computer algorithms that improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to do so.
In other words, we feed large amounts of training data into a system to teach it which data is correct or incorrect, true or false, and so on. The idea is similar to using flashcards to memorize words or definitions and then testing yourself with true-or-false questions. Eventually, you can identify the words, and maybe even their synonyms, without the flashcards. Similarly, once trained, a machine learning system can correctly classify new, unseen data without human intervention.
Machine learning algorithms commonly use supervised learning, where the model learns from labeled examples to predict a target, or unsupervised learning, where the model discovers structure (such as clusters or distributions) in unlabeled data. Other approaches exist but are less commonly used in business settings.
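To make the supervised case concrete, here is a minimal sketch of a toy supervised learner: a nearest-neighbor classifier that labels a number by finding the closest labeled example. All of the data, labels, and function names here are made up purely for illustration:

```python
def nearest_neighbor(training, query):
    """Return the label of the training example closest to `query`."""
    # Each training example is a (value, label) pair; pick the pair
    # whose value is nearest to the query.
    best_value, best_label = min(training, key=lambda pair: abs(pair[0] - query))
    return best_label

# Hypothetical labeled "flashcards" the model learns from.
training_data = [(1, "small"), (2, "small"), (90, "large"), (95, "large")]

print(nearest_neighbor(training_data, 3))   # -> small
print(nearest_neighbor(training_data, 80))  # -> large
```

Once the labeled examples are in place, the model classifies numbers it has never seen, which is the essence of supervised learning.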
Now back to the point:
Neural networks are built from simple processing units, called neurons, that are chained together in layers. (Despite the name, these neurons have nothing to do with brain neurons beyond the similarity in terminology.) By stacking layers of neurons that each do something simple, we can perform complex processing.
Neural networks are composed of three types of layers: an input layer, one or more hidden layers, and an output layer.
Image credit: https://medium.com/@societyofai/introduction-to-neural-networks-and-deep-learning-6da681f14e6
Neural networks are used in machine learning algorithms to do the actual classification. Each layer contains several neurons, each of which processes a fragment of the input data. Processing starts at the input layer, which splits the input data into chunks in an application-defined way. Each hidden layer then processes those chunks, producing an output that is eventually passed to the output layer.
Actually, a neuron’s computation is very simple: it takes a numerical input, multiplies it by a weight value, and passes the result as output to a neuron in the next layer. The key is that the neurons all have different weight values, which are adjusted during training.
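In code, that single-input computation is just a multiplication. A minimal sketch (the input and weight values are made up):

```python
def neuron_output(x, weight):
    # A bare single-input neuron: scale the input by its weight.
    # No bias or activation function yet; those come next.
    return x * weight

print(neuron_output(2.0, 0.5))  # -> 1.0
```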
As you saw in the image above, the layers do not all have to have the same number of neurons. Consequently, a neuron may receive input from more than one neuron in the previous layer and send its output to more than one neuron in the next layer.
Now, what happens when a neuron takes multiple inputs?
Well, each input is weighted first, most likely with a different weight value per input. The weighted inputs are then summed, and a constant number called a bias is added to get the result. By the way, every neuron has its own bias, even single-input neurons.
Finally, each neuron applies a special function, called an activation function, that takes the sum of the weighted inputs plus the bias as a single argument. We will see such functions in the next section.
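The full recipe, a weighted sum, plus a bias, passed through an activation function, can be sketched in a few lines of Python. The sigmoid is used here as one common activation choice, and the input, weight, and bias values are made up for illustration:

```python
import math

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs, plus the bias,
    passed through a sigmoid activation function."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)

# A neuron with two inputs, two weights, and a bias.
print(neuron([1.0, 2.0], [0.5, -0.25], 0.1))
```

Whatever the inputs, the sigmoid keeps the neuron’s output between 0 and 1, which is convenient when the next layer expects values in a predictable range.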
Here are some practical examples of activation functions that you will encounter in production machine learning programs. There are many more activation functions besides the ones listed here.
Binary step function
Linear function
Sigmoid function
Hyperbolic tangent (tanh) function
Rectified Linear Unit (ReLU) function
Leaky ReLU function
Maxout function
Here is another image that hopefully will help you understand what a neuron does.
Image credit: https://towardsdatascience.com/introduction-to-neural-networks-ead8ec1dc4dd
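Putting the pieces together, a full forward pass through a tiny network can be sketched as follows. The layer sizes and every weight and bias value here are made up purely for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Compute one layer: each neuron takes all the inputs, applies its
    own weights, adds its own bias, and runs the result through a
    sigmoid activation."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

# 2 inputs -> hidden layer of 3 neurons -> output layer of 1 neuron.
hidden = layer_forward([0.5, -1.2],
                       [[0.1, 0.4], [-0.3, 0.8], [0.7, -0.2]],
                       [0.0, 0.1, -0.1])
output = layer_forward(hidden, [[0.6, -0.4, 0.9]], [0.2])
print(output)
```

Note how the hidden layer’s outputs become the output layer’s inputs; that chaining is all a forward pass is.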
Thanks for reading. In the next few posts, we will learn more about neural networks, explore some machine learning toolkits, and learn ML programming in various languages.
Cover Image by @FotoArtist via Twenty20