Simple linear regression using python without Scikit-Learn

by Hemang Vyas, June 15th, 2018

This is my first story on Medium. In it, I am going to explain how to implement simple linear regression in Python without any machine learning library.

I have, however, used some basic libraries: pandas, NumPy and Matplotlib, to load the dataset, solve the equations and visualize the data, respectively.

You will find the notebook I created using sklearn, along with the dataset, in the GitHub repository.

I have explained the code below.


    import numpy as np
    import matplotlib.pyplot as plt


    class Regression:
        """Simple linear regression (y = a * x + b) using the least squares method."""

        @staticmethod
        def find_sum(l, p):
            # Sum of every element of l raised to the power p
            res = 0
            for i in l:
                res += i**p
            return res

        @staticmethod
        def find_mul_sum(l1, l2):
            # Sum of the element-wise products of l1 and l2
            res = 0
            for i in range(len(l1)):
                res += l1[i] * l2[i]
            return res

        @staticmethod
        def solve_equ(sum_x, sum_x2, sum_y, sum_xy, n):
            # Equation no 1: Ey  = a * Ex   + b * n
            # Equation no 2: Exy = a * Ex^2 + b * Ex
            # n is the number of data points
            p = np.array([[sum_x, n], [sum_x2, sum_x]])
            q = np.array([sum_y, sum_xy])
            # Solving the 2x2 system returns [a, b]
            return np.linalg.solve(p, q)

        @staticmethod
        def predict(x, res):
            # Apply y = a * x + b to every value in x
            y_pred = []
            for i in x:
                y_pred.append(res[0] * i + res[1])
            return y_pred


    def main():
        x = [1.1,1.3,1.5,2,2.2,2.9,3,3.2,3.2,3.7,3.9,4,4,4.1,4.5,4.9,5.1,5.3,5.9,6,6.8,7.1,7.9,8.2,8.7,9,9.5,9.6,10.3,10.5]
        y = [39343,46205,37731,43525,39891,56642,60150,54445,64445,57189,63218,55794,56957,57081,61111,67938,66029,83088,81363,93940,91738,98273,101302,113812,109431,105582,116969,112635,122391,121872]

        r = Regression()

        sum_x = r.find_sum(x, 1)
        sum_y = r.find_sum(y, 1)
        sum_x2 = r.find_sum(x, 2)
        sum_xy = r.find_mul_sum(x, y)

        # res[0] is the slope a, res[1] is the intercept b
        res = r.solve_equ(sum_x, sum_x2, sum_y, sum_xy, len(x))

        y_pred = r.predict(x, res)

        plt.scatter(x, y, color='red')      # actual values
        plt.plot(x, y_pred, color='blue')   # fitted line
        plt.title('Ownression')
        plt.xlabel('X')
        plt.ylabel('Y')
        plt.show()


    if __name__ == "__main__":
        main()

As you can see, I have created a class named Regression with the necessary methods, and for the sake of simplicity I have used basic sample data in x and y.

The first method in the class, find_sum, computes the sum of the list elements raised to a given power. If you know how to calculate regression coefficients on paper, this should not be a problem for you.
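
As a rough illustration of what these sums look like, here is a minimal sketch that uses Python's built-in sum instead of explicit loops. The tiny x and y lists are made up for the example, and the standalone functions are only stand-ins for the class methods above, not a replacement for them.

```python
def find_sum(values, power):
    """Sum of each value raised to `power` (same result as the loop version)."""
    return sum(v ** power for v in values)


def find_mul_sum(xs, ys):
    """Sum of the element-wise products of two equal-length lists."""
    return sum(a * b for a, b in zip(xs, ys))


# Hypothetical toy data, chosen so the sums are easy to check by hand
x = [1, 2, 3]
y = [2, 4, 6]

sum_x = find_sum(x, 1)        # 1 + 2 + 3
sum_x2 = find_sum(x, 2)       # 1 + 4 + 9
sum_xy = find_mul_sum(x, y)   # 2 + 8 + 18

print(sum_x, sum_x2, sum_xy)
```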

After getting all the sums, we have to create two equations, since we are using the least squares method.
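
For readers who want to see where the two equations come from, here is a short sketch of the usual derivation. The notation S, x_i and y_i is my own shorthand and is not used in the code; a, b and n are the slope, the intercept and the number of data points. Minimizing the sum of squared errors and setting both partial derivatives to zero gives the same two equations that appear as comments in solve_equ.

```latex
\begin{aligned}
S(a, b) &= \sum_{i=1}^{n} \left( y_i - a x_i - b \right)^2 \\
\frac{\partial S}{\partial b} = 0 \;&\Rightarrow\; \sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} x_i + b\,n \\
\frac{\partial S}{\partial a} = 0 \;&\Rightarrow\; \sum_{i=1}^{n} x_i y_i = a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i
\end{aligned}
```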

To solve the equations I have used NumPy's linalg.solve. Solving them gives the two constants that we use to compute predicted values of y from x for a test dataset. My solve_equ method returns the two unknowns of "y = a * x + b", that is, a and b.

The predict method creates a list named y_pred, which holds the predicted values for the inputs passed in as test data.

Finally, the main function calls all of these methods and plots a graph in which the red points show the actual values and the blue line shows the predicted values.
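
To make the solving step concrete, here is a small sketch on made-up data that lies exactly on y = 2x + 1, so the expected answer is known in advance. The data and variable names are assumptions for illustration only. The cross-check with np.polyfit is not part of the original approach; it is just a convenient way to confirm the result.

```python
import numpy as np

# Hypothetical toy data lying exactly on the line y = 2x + 1
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
n = len(x)

sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v ** 2 for v in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Ey  = a * Ex   + b * n
# Exy = a * Ex^2 + b * Ex
p = np.array([[sum_x, n], [sum_x2, sum_x]])
q = np.array([sum_y, sum_xy])

a, b = np.linalg.solve(p, q)
print(a, b)

# Cross-check: a degree-1 polynomial fit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
```

If the two results agree, the hand-built system of equations is set up correctly.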
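
The same idea can be sketched with a list comprehension and applied to values that were not in the training data. The coefficients below are hypothetical and are not the ones fitted from the sample dataset.

```python
def predict(x, res):
    """Return a * xi + b for every xi in x, where res = [a, b]."""
    a, b = res
    return [a * xi + b for xi in x]


# Hypothetical coefficients: slope a = 2, intercept b = 1
res = [2.0, 1.0]

# Predict for inputs the model has not seen before
y_pred = predict([0.0, 1.5, 10.0], res)
print(y_pred)
```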

Hope you liked the article. If you have any questions about it, let me know.

Thank you for reading. :)