Artificial neural networks (ANNs), and especially deep neural networks (DNNs), have transformed the way we interact with computers. We are going to study how they function, starting with the heart of the artificial neural network, the artificial neuron (which is modelled on how a biological neuron works), followed by a brief history of ANNs. We then look at a simple example of a neural network that has been programmed and trained.
First, we look at the biological neuron to understand the model that ANNs are based on:
Neurons process information from our senses, such as sights, sounds, touch and smells.
The neuron (in our brains) has four basic components: the dendrites, the cell body (soma), the axon and the axon terminals.
(Synapses are also very important. However, they are not physical parts of the neuron; they are the gaps between the axon terminals of one neuron and the dendrites of the next, where the electrical impulse is converted into a chemical signal.)
The three main kinds of neuron are:
Sensory neurons:
Sensory neurons carry information from our sense organs (our eyes, ears, skin, nose and tongue) towards the brain and spinal cord. They can also be referred to as afferent neurons.
Interneurons:
Interneurons sit between other neurons in the brain and spinal cord, passing the signals they receive from sensory neurons on towards motor neurons.
Motor neurons:
Motor neurons are effectively the opposite of sensory neurons: the information they receive decides what they will do to affect their environment, for example telling an organ or muscle what action it needs to perform. The term efferent neurons can also be used when referring to them.
There are four key steps in conveying information between neurons: first, the electrical impulse (the action potential) travels along the axon of the first neuron until it reaches the axon terminals; second, the axon terminals release chemicals called neurotransmitters into the synapse; third, the neurotransmitters cross the gap and bind to receptors on the dendrites of the next neuron; and finally, the receiving neuron converts this chemical signal back into an electrical impulse.
We need these steps because the electrical impulses cannot travel across the gaps; this is why they must be converted to chemical signals.
A neuron fires based on whether the "action potential" is above or below a threshold level; the action potential is the electrical stimulus that releases the neurotransmitters. If the stimulus does not reach the threshold level, the neuron will not fire; if it does, the neuron fires. Each neuron in our brain can fire around 200 times a second, and each time a neuron fires, the information is passed on to about 1,000 other neurons. Our brain contains around 100 billion (100,000,000,000) neurons.
Artificial neural networks are based on the template of our brain. An ANN receives data (the inputs) and then produces an output. ANNs are made up of multiple layers of artificial neurons that are connected in a similar way to the biological neurons in the brain. Each artificial neuron receives signals from the neurons it is connected to and "fires" if the sum of those signals passes the threshold of its activation function, as shown in Diagram 4.
The threshold level in an artificial neuron decides whether the output will be closer to 1 or to 0, similar to a biological neuron firing or not firing. Another similarity is that one neuron cannot complete complicated functions or tasks by itself; an ANN has several layers: an input layer, hidden layers, and an output layer. Each layer is made up of "nodes". The nodes are similar to neurons: they receive one or more inputs from other nodes, apply a function (or formula) to them, and then pass the output on to the next nodes. The hidden layers are called "hidden" because they have no connection to the outside world; all the information they receive comes from the input layer, and their output goes to the output layer rather than to the outside world.
A neuron is essentially a "mathematical function": it can take any number of inputs (numbers) and can be trained, through forward and backward propagation, to respond in a certain way (to give the wanted output). We train a neural network by giving it inputs along with the outputs we want it to give back; the inputs are plugged into an activation function such as the sigmoid function. To steer the output, we adjust factors that are applied to the inputs, called weights.
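As a rough sketch (this snippet is only an illustration and not part of the program shown at the end of this article; the function name neuron and the example numbers are made up), a single artificial neuron just multiplies each input by a weight, adds a bias, and passes the sum through the sigmoid function:

import numpy as np

def sigmoid(x):
    # Squashes any number into the range (0, 1)
    return 1 / (1 + np.exp(-x))

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, passed through the activation function
    return sigmoid(np.dot(inputs, weights) + bias)

# Two inputs; the second is given a bigger weight, so it matters more to the output
print(neuron(np.array([1.0, 0.5]), np.array([0.2, 0.9]), bias=-0.1))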
Weights add importance to certain factors. For example, if you are deciding whether to go for a walk, whether the weather is good matters less than whether it is dangerous or not, so we would give a bigger weight to the "danger" input so that it affects the output more. We adjust the weights so that they add importance while the other inputs are still taken into account when the output is decided. The weights and biases are altered until the output we are given closely resembles the output we want. The loops we run are called epochs: in each epoch we change the variables (the weights and biases), and the numbers are propagated forward and back again to check how much closer the result is to the true output.
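As another rough sketch (again only an illustration, with made-up numbers and a single neuron rather than a full network), each epoch propagates the inputs forward to get a prediction, measures the error against the wanted output, and then nudges the weights and bias so the next prediction comes a little closer:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

inputs = np.array([[0.0, 1.0]])   # two example input values
target = np.array([[1.0]])        # the output we want the neuron to learn to give
weights = np.random.uniform(size=(2, 1))
bias = np.zeros((1, 1))
lr = 0.5                          # learning rate: how big each adjustment is

for epoch in range(1000):
    prediction = sigmoid(inputs @ weights + bias)     # forward propagation
    error = target - prediction                       # how far off we are
    gradient = error * prediction * (1 - prediction)  # slope of the sigmoid at the prediction
    weights += inputs.T @ gradient * lr               # backward propagation:
    bias += gradient * lr                             # adjust the weights and bias

print(prediction)  # after many epochs this should be close to the target, 1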
A real-life example of supervised learning is a toddler learning to recognize objects. If, each time the toddler sees a cat, you tell the toddler that it is a cat, that is supervised learning: you are making the connections for the toddler instead of letting them make their own connections between the animals. Unsupervised learning is allowing the toddler to figure out on their own that the animal is a cat. Unsupervised learning allows more unusual patterns to be used to recognize the animal.
In ANNs, the supervised model is given both the inputs and the outputs: the toddler gets the input (seeing the cat) and the output (being told it is a cat). An unsupervised model only gets the input, with no output: the toddler only gets the input (seeing the cat) and isn't given the output (doesn't get told that it is a cat). Supervised learning uses algorithms that are trained on labelled data, unlike unsupervised learning, where the data isn't labelled. In unsupervised learning, the number of classes isn't known to the programmer; in supervised learning, the programmer knows the classes and is more in control of them. In Python, a program can be trained and validated on sample data sets, as sketched below.
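Here is a minimal sketch of the difference, assuming the scikit-learn library is installed (scikit-learn and the tiny made-up data set are my own choices for illustration; they are not part of the program later in this article):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # the inputs (features)
y = np.array([0, 1, 1, 1])                      # the labels a "teacher" provides

# Supervised: the algorithm sees both the inputs and the labelled outputs
classifier = LogisticRegression().fit(X, y)
print(classifier.predict(X))      # predictions for the known classes

# Unsupervised: the algorithm only sees the inputs and groups them itself;
# it is never told what the groups mean
clustering = KMeans(n_clusters=2, n_init=10).fit(X)
print(clustering.labels_)         # cluster number assigned to each input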
However, ANNs haven't been around for very long…
The first influential idea for neural networks came in 1943, proposed by Warren McCulloch and Walter Pitts. The idea was taken further by The Organization of Behavior, a book written by Donald Hebb in 1949, approximately six years later. Hebb suggested that the more we use particular neural pathways, the stronger they become. This was one of the fundamental theories of the book, and it is often paraphrased by the saying "Cells that fire together, wire together." It also underpins one of the models Hebb proposed, Hebbian learning. The model is based on the brain's neural plasticity: the brain's capacity to reorganize itself by creating new neural connections throughout life. Neuroplasticity enables the neurons in the brain to compensate for damage and illness and to adjust their activity in response to new conditions or changes in their environment.
Another forerunner of neural networks is "Threshold Logic".
Both of these ideas were formed in the 1940s; however, it wasn't until 1954 that the first Hebbian network was successfully built at MIT. Around the same time, a psychologist called Frank Rosenblatt was studying flies and their fight-or-flight response, a system located in the eye of the fly. In 1958 he proposed the Perceptron, which he called the Mark I Perceptron; it consisted of an activation function, weights and a summation processor. By 1959, two models had been programmed, one of which had a real-life application (the first time a neural network had one). They were called ADALINE and MADALINE; in fact, MADALINE is still in use now.
Then all development in the field was brought to an abrupt halt: fear of the technology's unknown capabilities and other issues arose, funding was cut, and influential people started to turn against research into ANNs. This "AI winter" lasted until 1982, when John Hopfield submitted a paper to the National Academy of Sciences. It wasn't until the 1990s that ANNs were back for good, and even though ANNs are a fairly recent development, they already have big impacts on our lives right now...
A common use of ANNs is photo recognition: Facebook auto-tagging a person in a picture, or Google Photos making albums based on who is in the photo. A similar use is character recognition, reading human handwriting. However, this can be unreliable, as the computer isn't human and doesn't always account for the variation in people's writing. ANNs can even be used in music composition, by recognizing patterns in the music. They are also being used for more serious, practical applications: in hospitals, for example, they can be used to recognize cancerous tumours. ANNs can also be trained so that robots can carry out human tasks without needing a human to operate them.
One of the most common ideas for how neural networks will be used in the future is driverless cars, something that could become real sooner than we think. They are also particularly useful right now: they power autocorrect, and they show you ads or music recommendations based on what you listen to, the websites you visit or the videos you watch. When analyzing the stock market, ANNs can make some accurate predictions about stock prices. One particular field where ANNs and AI can make a real difference is medicine: as mentioned before, as well as diagnosing cancers and recognizing tumours, they can also be used to predict cancer. However, their accuracy can vary from "50% to 100%" [https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0212356].
With more and more hospitals digitizing their records of patients, diseases and treatments, ANNs can evaluate the effectiveness of treatments and cures. Other uses of ANNs include speech recognition and spell checking.
The future of ANNs is strongly influenced by their current uses in our lives...
What could affect the future of ANNs?
One factor that could affect how much ANNs are used and valued is their data requirements: by this I mean the learning period during which we give the neural network the data it needs to start recognizing patterns. ANNs use data to figure out whether two things are the same, based on their structure or appearance. Prakash Kannan, a quantitative analyst and software developer in finance, thinks that in the future ANNs will be able to mimic basic human cognitive functions and have a rough understanding of basic human emotional models: "There may not be much difference between human cognitive behaviour and AI behaviour". Of course, AI will probably never have the same emotional depth and understanding as humans (in my opinion, and based on the articles I have read), or if it does, not for a very long time.

Many people are losing, and will lose, their jobs because of AI. Labour tasks, such as factory work, will be done by efficient and stronger robots, and people who analyze data for patterns will also eventually be replaced; many people in HR are losing their jobs now or will do so soon. AI will be more efficient, cost less and be more accurate; however, AI still needs humans to program it, provide its data and carry out other tasks without which it cannot function. Many new jobs will also be created, such as data scientists.
What do you think of ANNs?
I think it's the next evolution in the applications of computers, one that brings them closer to human thinking and cognition.
What do you think the future of ANNs will be?
I believe ANNs will start mimicking most of the basic human cognitive functions like vision, sound etc. It will take some time for the ANNs to achieve the basic emotional models of human beings, but I would not be surprised if someone comes up with a model of human emotions. Truly there will not be much distinction between human thinking and AI thinking.
How useful do you think ANNs will be in the future?
It would be very useful in day-to-day applications where there is a vast amount of data and the human brain is currently overwhelmed in processing it. So ANNs would be able to reduce the vast amount of data to a few decision choices, or could present their decision to be validated by a human.
How useful are they now?
They are very useful in classifying vast amounts of data (images, email etc) into categories.
What AI or ANNs have you come across in everyday life?
Automatic classification of email as junk, search engines like Google, Alexa, self-driving cars, etc.
Have you used TensorFlow? How useful is it?
Yes, I have started using it for experimenting with financial models.
What do you think are the main ways you can use AI/ANNs in your job?
Currently, I am investigating the possibility of using ANNs in the classification of huge amounts of financial data and the calibration of financial models to market prices. I am using Python and TensorFlow for this.
Has AI changed our lives? In a big way or small way?
Yes, especially in medicine, where the accuracy of detecting cancers/tumours etc. has improved vastly and leads to early detection and cure.
What new jobs could be opened because of AI and ANNs?
Data analyst, data scientist, etc., as well as research on new network models.
Do you think AI can be dangerous?
Yes. As ANNs work like a black box, the solutions and values they output should be thoroughly validated in critical applications. There are issues such as overfitting, where the assumption of learning is not valid, and if the inputs are outside the range of "learned values" the output may not be valid. It may be OK to periodically misclassify junk mail or an image, but it could be disastrous if such errors happen in self-driving cars or military defence.
What are the major AI and ANNs companies?
Google, Facebook, Amazon, and Microsoft have all already invested heavily in AI/ANNs and will continue to invest a big chunk in the future.
What languages can be/ have you used to program ANNs or AI?
The TensorFlow library, Keras and Python; currently, Python is the de facto standard for developing ANNs.
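Finally, here is the simple example of a programmed and trained neural network mentioned at the start. It is a small network written in Python with NumPy: two input nodes, one hidden layer of 25 nodes and one output node. Judging by the expected outputs, it is trained to learn the XOR pattern (output 1 when exactly one of the two inputs is 1, and 0 otherwise) by running forward and backward propagation for 100,000 epochs, and it is then tested on each of the four possible input pairs.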
import numpy as np

# Sigmoid activation function: squashes any input into the range (0, 1)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the sigmoid, written in terms of the sigmoid's output
def sigmoid_derivative(x):
    return x * (1 - x)
# Input dataset: all four input pairs and the XOR output we want for each
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
expected_output = np.array([[0], [1], [1], [0]])

epochs = 100000  # how many times we loop over the data
lr = 0.1         # learning rate: how big each weight adjustment is
inputLayerNeurons, hiddenLayerNeurons, outputLayerNeurons = 2, 25, 1
# Random weights and bias initialization
hidden_weights = np.random.uniform(size=(inputLayerNeurons, hiddenLayerNeurons))
hidden_bias = np.random.uniform(size=(1, hiddenLayerNeurons))
output_weights = np.random.uniform(size=(hiddenLayerNeurons, outputLayerNeurons))
output_bias = np.random.uniform(size=(1, outputLayerNeurons))
# Helper functions: one forward pass through the network, and one
# backpropagation step that updates the weights and biases in place.
def forward_propagation(inputs, hidden_weights, hidden_bias, output_weights, output_bias):
    # Hidden layer: weighted sum of the inputs plus the bias, then the sigmoid activation
    hidden_layer_activation = np.dot(inputs, hidden_weights) + hidden_bias
    hidden_layer_output = sigmoid(hidden_layer_activation)
    # Output layer: weighted sum of the hidden outputs plus the bias, then the sigmoid activation
    output_layer_activation = np.dot(hidden_layer_output, output_weights) + output_bias
    predicted_output = sigmoid(output_layer_activation)
    return hidden_layer_output, predicted_output

def backward_propagation(inputs, expected_output, hidden_layer_output, predicted_output,
                         hidden_weights, hidden_bias, output_weights, output_bias, lr):
    # Error at the output layer and its gradient
    error = expected_output - predicted_output
    d_predicted_output = error * sigmoid_derivative(predicted_output)
    # Propagate the error back to the hidden layer
    error_hidden_layer = d_predicted_output.dot(output_weights.T)
    d_hidden_layer = error_hidden_layer * sigmoid_derivative(hidden_layer_output)
    # Updating weights and biases (the arrays are modified in place)
    output_weights += hidden_layer_output.T.dot(d_predicted_output) * lr
    output_bias += np.sum(d_predicted_output, axis=0, keepdims=True) * lr
    hidden_weights += inputs.T.dot(d_hidden_layer) * lr
    hidden_bias += np.sum(d_hidden_layer, axis=0, keepdims=True) * lr
# Training algorithm: repeat forward propagation and backpropagation for every epoch
for _ in range(epochs):
    hidden_layer_output, predicted_output = forward_propagation(
        inputs, hidden_weights, hidden_bias, output_weights, output_bias)
    backward_propagation(inputs, expected_output, hidden_layer_output, predicted_output,
                         hidden_weights, hidden_bias, output_weights, output_bias, lr)
print("Initial hidden weights: ", end='')
print(*hidden_weights)
print("Initial hidden biases: ", end='')
print(*hidden_bias)
print("Initial output weights: ", end='')
print(*output_weights)
print("Initial output biases: ", end='')
print(*output_bias)
print("The input: [1,1]")
test_inputs = ([[1, 1]])
hidden_layer_activation = np.dot(inputs, hidden_weights)
hidden_layer_activation += hidden_bias
hidden_layer_output = sigmoid(hidden_layer_activation)
output_layer_activation = np.dot(hidden_layer_output, output_weights)
output_layer_activation += output_bias
predicted_output = sigmoid(output_layer_activation)
hidden_layer_activation = np.dot(test_inputs, hidden_weights)
hidden_layer_activation += hidden_bias
hidden_layer_output = sigmoid(hidden_layer_activation)
output_layer_activation = np.dot(hidden_layer_output, output_weights)
output_layer_activation += output_bias
predicted_output = sigmoid(output_layer_activation)
print("This is the predicted output: ")
print(predicted_output)
print("The input: [0,1]")
test_inputs = ([[0, 1]])
hidden_layer_activation = np.dot(inputs, hidden_weights)
hidden_layer_activation += hidden_bias
hidden_layer_output = sigmoid(hidden_layer_activation)
output_layer_activation = np.dot(hidden_layer_output, output_weights)
output_layer_activation += output_bias
predicted_output = sigmoid(output_layer_activation)
hidden_layer_activation = np.dot(test_inputs, hidden_weights)
hidden_layer_activation += hidden_bias
hidden_layer_output = sigmoid(hidden_layer_activation)
output_layer_activation = np.dot(hidden_layer_output, output_weights)
output_layer_activation += output_bias
predicted_output = sigmoid(output_layer_activation)
print("This is the predicted output: ")
print(predicted_output)
print("The input: [1,0]")
test_inputs = ([[1, 0]])
hidden_layer_activation = np.dot(inputs, hidden_weights)
hidden_layer_activation += hidden_bias
hidden_layer_output = sigmoid(hidden_layer_activation)
output_layer_activation = np.dot(hidden_layer_output, output_weights)
output_layer_activation += output_bias
predicted_output = sigmoid(output_layer_activation)
hidden_layer_activation = np.dot(test_inputs, hidden_weights)
hidden_layer_activation += hidden_bias
hidden_layer_output = sigmoid(hidden_layer_activation)
output_layer_activation = np.dot(hidden_layer_output, output_weights)
output_layer_activation += output_bias
predicted_output = sigmoid(output_layer_activation)
print("This is the predicted output: ")
print(predicted_output)
print("The input: [0,0]")
test_inputs = ([[0, 0]])
hidden_layer_activation = np.dot(inputs, hidden_weights)
hidden_layer_activation += hidden_bias
hidden_layer_output = sigmoid(hidden_layer_activation)
output_layer_activation = np.dot(hidden_layer_output, output_weights)
output_layer_activation += output_bias
predicted_output = sigmoid(output_layer_activation)
hidden_layer_activation = np.dot(test_inputs, hidden_weights)
hidden_layer_activation += hidden_bias
hidden_layer_output = sigmoid(hidden_layer_activation)
output_layer_activation = np.dot(hidden_layer_output, output_weights)
output_layer_activation += output_bias
predicted_output = sigmoid(output_layer_activation)
print("This is the predicted output: ")
print(predicted_output)
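When the script is run, the four predicted outputs should come out close to 0, 1, 1 and 0 for the inputs [1,1], [0,1], [1,0] and [0,0] respectively, matching the XOR pattern the network was trained on. Because the weights and biases start from random values, the exact numbers (and the printed final weights) will differ slightly from run to run.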