Model evaluation matters because we need to understand how well our model is performing. Compared to classification, the performance of a regression model is harder to judge: unlike a class label, the exact value of a continuous target variable is almost impossible to predict, so we need a way to quantify how close a prediction is to the real value.
Several evaluation metrics are popular for regression models, and we are going to dive into them in the following sections.
Mean absolute error (MAE) is a simple and intuitive metric, which is part of why it is so popular. It is the average of the absolute distances between the predicted and the true values. Each of these distances is the error of a single prediction, and the overall error for the whole dataset is the average of all the prediction error terms. We take the absolute value of the distances to prevent negative and positive errors from canceling each other out.
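As a minimal sketch of this definition (the toy y_true/y_pred values below are made up purely for illustration), MAE can be computed by hand with NumPy or directly with scikit-learn's mean_absolute_error:

import numpy as np
from sklearn.metrics import mean_absolute_error
# toy true and predicted values, purely for illustration
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
# MAE by hand: the average of the absolute prediction errors
mae_manual = np.mean(np.abs(y_true - y_pred))
# the same value via scikit-learn
mae_sklearn = mean_absolute_error(y_true, y_pred)
print(mae_manual, mae_sklearn)  # both print 0.5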
Advantages
Disadvantages
Mean squared error (MSE) is one of the most widely used metrics for regression problems. It is the average of the squared distances between the actual values and the predicted values. Squaring the error terms keeps negative and positive differences from canceling each other out, and it also penalizes large errors more heavily than small ones.
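A minimal sketch with the same toy values as before, computing MSE both by hand and with scikit-learn's mean_squared_error:

import numpy as np
from sklearn.metrics import mean_squared_error
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
# MSE by hand: the average of the squared prediction errors
mse_manual = np.mean((y_true - y_pred) ** 2)
print(mse_manual, mean_squared_error(y_true, y_pred))  # both print 0.375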
Advantages
Disadvantages
As the name already suggests, root mean squared error (RMSE) is the square root of the mean of the squared distances, i.e. the square root of MSE. Taking the root brings the error back to the same units as the target variable. RMSE is also a popular evaluation metric, especially in deep learning.
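Continuing the same toy example, RMSE is simply the square root of the MSE computed above:

import numpy as np
from sklearn.metrics import mean_squared_error
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
# RMSE: the square root of the mean squared error
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print(rmse)  # about 0.612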
Advantages
Disadvantages
R-squared is a different kind of metric from the ones we have discussed so far: it does not directly measure the error of the model.
R-squared evaluates the scatter of the data points around the fitted regression line. It is the proportion of the variation in the target variable that the model explains, relative to the total variance of the target. It is also known as the “coefficient of determination” or goodness of fit.
R-squared is calculated as one minus the ratio of the sum of squared prediction errors to the total sum of squares, where the total sum of squares is the error we would get if every prediction were simply the mean of the real values.
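As a sketch of that definition with the same toy values, we can compute R-squared by hand and check it against scikit-learn's r2_score:

import numpy as np
from sklearn.metrics import r2_score
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
ss_res = np.sum((y_true - y_pred) ** 2)         # sum of squared prediction errors
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares around the mean
r2_manual = 1 - ss_res / ss_tot
print(r2_manual, r2_score(y_true, y_pred))  # both about 0.949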
For a least-squares model evaluated on its own training data, R-squared lies between 0 and 1. A value of 0 indicates that the model explains none of the variation in the target variable around its mean; such a model essentially just predicts the mean of the target. A value of 1 indicates that the model explains all of the variance in the target variable around its mean. (On unseen data, R-squared can even become negative if the model performs worse than simply predicting the mean.)
A larger R-squared value usually indicates that the regression model fits the data better; however, a high R-squared does not necessarily mean a good model.
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
import seaborn as sns; sns.set_theme(color_codes=True)

# Generate a synthetic 1-D regression dataset with moderate noise
X, y = make_regression(n_samples=80, n_features=1, n_informative=1,
                       bias=50, noise=15, random_state=42)

# Plot the data with a fitted regression line (regplot expects 1-D arrays)
plt.figure()
ax = sns.regplot(x=X[:, 0], y=y)

# Fit a linear model and report its R-squared on the training data
model = LinearRegression()
model.fit(X, y)
print('R-squared score: {:.3f}'.format(model.score(X, y)))
# The same data-generating process, but with much higher noise
X, y = make_regression(n_samples=80, n_features=1, n_informative=1,
                       bias=50, noise=200, random_state=42)

plt.figure()
ax = sns.regplot(x=X[:, 0], y=y)

# The same linear model now explains far less of the variance
model = LinearRegression()
model.fit(X, y)
print('R-squared score: {:.3f}'.format(model.score(X, y)))
Advantages
Disadvantages
It is still possible to fit a good model to a dataset with a lot of variance, and such a model will likely have a low R-squared. A low score does not necessarily mean the model is bad if it still captures the general trend in the data and the effect of a change in a predictor on the target variable. R-squared becomes a real problem only when we want to predict the target variable with high precision, that is, with a small prediction interval.
A high R-squared score also does not necessarily mean a good model, because R-squared cannot detect bias; checking the residual plots as well is therefore a good idea. As mentioned previously, a model with a high R-squared score can also be overfitting, since it captures most of the variance in the training data. It is therefore always a good idea to compute the R-squared score on held-out predictions and compare it to the R-squared score on the training data.
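A minimal sketch of that check, reusing the synthetic data setup from above (the exact scores depend on the random split):

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
X, y = make_regression(n_samples=200, n_features=1, n_informative=1,
                       bias=50, noise=15, random_state=42)
# Hold out part of the data so the two scores can be compared
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = LinearRegression().fit(X_train, y_train)
print('Train R-squared: {:.3f}'.format(model.score(X_train, y_train)))
print('Test R-squared:  {:.3f}'.format(model.score(X_test, y_test)))
# A large gap between the two scores is a sign of overfitting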
If you like this, feel free to follow me for more free machine-learning tutorials and courses!