In this post, we will see how to implement the perceptron model in Python using the breast cancer data set.
A perceptron is a fundamental unit of a neural network: it takes weighted inputs, processes them, and is capable of performing binary classification. This is a follow-up to my previous post on the Perceptron Model.
If you want to skip the theory and jump straight into the code, click here.
Disclaimer: The content and the structure of this article are based on the deep learning lectures from One-Fourth Labs (Padhai).
In the perceptron model, inputs can be real numbers, unlike the Boolean inputs of the MP Neuron model. The output of the model is still binary {0, 1}. The perceptron takes the input x and computes the weighted sum of the inputs: if this sum is greater than the threshold b, the output is 1; otherwise, the output is 0.
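The decision rule described above can be written in a few lines of NumPy. This is only a minimal sketch: the function name perceptron_output and the example weights and threshold are illustrative assumptions, not values from the original code.

```python
import numpy as np

def perceptron_output(x, w, b):
    """Return 1 if the weighted sum of the inputs exceeds the threshold b, else 0."""
    weighted_sum = np.dot(w, x)
    return 1 if weighted_sum > b else 0

# Illustrative values (assumptions): two real-valued inputs, hand-picked weights.
w = np.array([0.5, -0.2])
b = 0.1

print(perceptron_output(np.array([1.0, 1.0]), w, b))  # weighted sum 0.3 is above b
print(perceptron_output(np.array([0.0, 1.0]), w, b))  # weighted sum -0.2 is below b
```

The first call returns 1 and the second returns 0, because only the first weighted sum crosses the threshold.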
The main goal of the learning algorithm is to find a weight vector w that perfectly separates the positive set P (y = 1) from the negative set N (y = 0). The perceptron learning algorithm goes like this:
Fig 2: Perceptron Algorithm
To understand the learning algorithm in detail, and the intuition behind why updating the weights leads to a perfect separation of the positive and negative data sets, please refer to my previous post on the Perceptron Model.
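As a rough illustration, the learning algorithm can be sketched as follows. This is a minimal version of the standard perceptron update rule with an explicit threshold b, and it may differ in detail from the algorithm shown in the figure. The function name train_perceptron, the max_epochs argument, the zero initialization, and the toy OR data are all assumptions made for the example.

```python
import numpy as np

def train_perceptron(X, y, max_epochs=100):
    """Standard perceptron updates: nudge w and b whenever a point is misclassified."""
    w = np.zeros(X.shape[1])  # assumption: start from zero weights
    b = 0.0                   # assumption: start from a zero threshold
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            pred = 1 if np.dot(w, xi) > b else 0
            if yi == 1 and pred == 0:
                # Positive point predicted as negative: move w towards xi, lower b.
                w += xi
                b -= 1
                errors += 1
            elif yi == 0 and pred == 1:
                # Negative point predicted as positive: move w away from xi, raise b.
                w -= xi
                b += 1
                errors += 1
        if errors == 0:
            break  # every point is classified correctly
    return w, b

# Toy, linearly separable data (logical OR), used only for illustration.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_toy = np.array([0, 1, 1, 1])

w, b = train_perceptron(X_toy, y_toy)
predictions = [1 if np.dot(w, xi) > b else 0 for xi in X_toy]
print(w, b, predictions)
```

On linearly separable data such as this toy set, the loop stops once a full pass produces no misclassifications.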
The data set we will be using is the breast cancer data set from sklearn. It has 569 observations and 30 variables, excluding the class variable. The breast cancer data is imbalanced, meaning the classes ‘0’ and ‘1’ are not represented equally. In this example, we will not apply any sampling techniques to balance the data, because this is a simple implementation of the perceptron model.
Class Imbalance
Before we start building the perceptron model, we first need to load the required packages and the data set. The data set is available in the sklearn datasets module. Once the data is loaded, we grab the features and the response variable using breast_cancer.data and breast_cancer.target.
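The loading step can be sketched as below. The variable names X, Y and class_counts are assumptions chosen for the example; load_breast_cancer, .data, .target and .target_names come from sklearn itself.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# Load the data set bundled with sklearn.
breast_cancer = load_breast_cancer()

X = breast_cancer.data    # feature matrix
Y = breast_cancer.target  # class labels (0 or 1)

print(X.shape)  # (569, 30)

# Count how many observations fall into each class to see the imbalance.
class_counts = np.bincount(Y)
for name, count in zip(breast_cancer.target_names, class_counts):
    print(name, count)  # 212 malignant (class 0), 357 benign (class 1)
```

The counts confirm the class imbalance mentioned earlier: class 1 has noticeably more observations than class 0.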
Perceptron Preprocessing
After fetching the X and Y variables, we will perform Min-Max scaling to bring all the features into the range 0 to 1. Before building the model, we will split the data so that we can train the model on the training data and test its performance on the testing data. We will use sklearn’s train_test_split function to split the data in a 90:10 ratio for training and testing respectively. Now that we are done with the preprocessing steps, we can start building the model. We will build our model inside a class called perceptron.
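The preprocessing steps described above could look roughly like this. It is a sketch under a few assumptions: MinMaxScaler is used for the Min-Max scaling, and the stratify and random_state arguments are my own choices to keep the split reproducible and the class ratio stable; they are not stated in the text.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X, Y = load_breast_cancer(return_X_y=True)

# Scale every feature into the range 0 to 1, following the order described above.
X_scaled = MinMaxScaler().fit_transform(X)

# 90:10 split; stratify and random_state are assumptions, not from the article.
X_train, X_test, Y_train, Y_test = train_test_split(
    X_scaled, Y, test_size=0.1, stratify=Y, random_state=1
)

print(X_train.shape, X_test.shape)  # (512, 30) (57, 30)
```

A stricter variant would fit the scaler on the training split only and reuse it on the test split, so that no information from the test data leaks into the preprocessing.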
In the perceptron class, we will create a constructor function, __init__. The constructor initializes the weight vector w and the threshold b to None.
Perceptron Model
The model function takes the input values x as an argument, performs the weighted aggregation of the inputs (the dot product of w and x), and returns 1 if the aggregation is greater than the threshold b, else 0. Next, we have the predict function, which takes the input values x as an argument; for every observation in x, it calculates the predicted outcome, and it returns a list of predictions.
Finally, we will implement the fit function to learn the best possible weight vector w and threshold value b for the given data. The function takes the input data (x and y), the learning rate, and the number of epochs as arguments.
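Putting the constructor, model, predict and fit together, the class could be sketched as follows. This is not necessarily the exact code from the repository: the class is named Perceptron (capitalized, following Python convention), and the zero initialization inside fit, the default values of epochs and lr, and the toy AND data are assumptions made for illustration.

```python
import numpy as np

class Perceptron:
    def __init__(self):
        # Parameters are unknown until fit is called.
        self.w = None
        self.b = None

    def model(self, x):
        # Weighted aggregation of the inputs compared against the threshold b.
        return 1 if np.dot(self.w, x) > self.b else 0

    def predict(self, X):
        # One prediction per observation in X.
        return [self.model(x) for x in X]

    def fit(self, X, Y, epochs=10, lr=1.0):
        # Assumption: start from zero weights and a zero threshold.
        self.w = np.zeros(X.shape[1])
        self.b = 0.0
        for _ in range(epochs):
            for x, y in zip(X, Y):
                y_pred = self.model(x)
                if y == 1 and y_pred == 0:
                    # Missed a positive: move w towards x and lower the threshold.
                    self.w += lr * x
                    self.b -= lr
                elif y == 0 and y_pred == 1:
                    # False positive: move w away from x and raise the threshold.
                    self.w -= lr * x
                    self.b += lr
        return self

# Toy, linearly separable data (logical AND), used only for illustration.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
Y_toy = np.array([0, 0, 0, 1])

toy_model = Perceptron().fit(X_toy, Y_toy, epochs=10, lr=1.0)
print(toy_model.predict(X_toy))  # [0, 0, 0, 1]
```

Updating the threshold in the opposite direction to the weights keeps the rule consistent: lowering b makes it easier for the weighted sum to exceed it, which is what a missed positive example calls for.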
Perceptron Model Execution
Once our class is ready, we initialize a new perceptron object and use it to call the fit method on our training data to learn the best possible parameters. We then evaluate the model by calculating its accuracy on the test data.
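The testing accuracy is simply the fraction of test observations whose predicted label matches the true label. A minimal sketch, where the helper name accuracy and the placeholder label lists are assumptions (in practice they would be the test labels and the output of the predict method):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

# Placeholder labels for illustration only, not real model output.
y_test_example = [1, 0, 1, 1, 0]
y_pred_example = [1, 0, 0, 1, 0]

print(accuracy(y_test_example, y_pred_example))  # 0.8
```

sklearn's accuracy_score function from sklearn.metrics computes the same quantity and can be used instead of a hand-written helper.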
The entire code discussed in the article is available in this GitHub repository. Feel free to fork it or download it.
You can try out a few possible improvements to increase the accuracy of the model,
In this article, we have seen how to implement the perceptron algorithm from scratch using Python.
Connect with Me
GitHub: https://github.com/Niranjankumar-c
LinkedIn: https://www.linkedin.com/in/niranjankumar-c/