Softmax Regression

This is part 2 of a 5-article series:

1. Training an Architectural Classifier: Motivations
2. Training an Architectural Classifier: Softmax Regression
3. Training an Architectural Classifier: Deep Neural Networks
4. Training an Architectural Classifier: Convolutional Networks
5. Training an Architectural Classifier: Transfer Learning

A personal side goal of this project is becoming more acquainted with deep learning frameworks, so although sklearn and the like may have a Logistic Regression module, I’ll be doing this more manually in TensorFlow. You’ll also see tf.slim and Keras.

Working with the concept that it’s better to start with a simpler, more explicable model and only add complexity if necessary, I’ll start by trying simple logistic/softmax regression. In short, the goal of logistic regression is to make a prediction by taking an input image, multiplying each of its features (pixels in this case) by a positive or negative weight, summing the results, and then adding a bit of bias.

This should sound familiar to anyone with some math experience: it’s the equation of a line, y = mx + b, except in this case our line exists in VERY high-dimensional space (m, x, and b are high-dimensional matrices instead of the scalars you used in school). This makes some intuitive sense when you consider that what we are attempting to do is draw a line, or hyper-plane, through space that can separate images of one class from those in another.

[Figure: The basic formula for making a prediction, or drawing a line.]

These weights represent the learned likelihood of a pixel contributing positively or negatively to the overall image being in a certain class. Thus the value of the pixel, multiplied by the learned weight, gives a kind of “vote” towards the final result. Using a softmax function, these votes are then converted to probabilities that an image belongs to a given class.
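To make the “votes” idea concrete, here is a minimal sketch of the prediction step in plain Python rather than the TensorFlow used in the notebook. The 4-pixel image, the weight layout (one row of weights per class), and all of the numbers are made up for illustration; they are not taken from the actual model.

```python
import math

def softmax(scores):
    """Convert a list of raw class scores ("votes") into probabilities."""
    # Subtracting the max score keeps exp() from overflowing; the result is unchanged.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(pixels, weights, biases):
    """Compute class probabilities for one flattened image.

    weights[k][i] is the weight of pixel i for class k (a hypothetical layout).
    """
    # One score per class: weighted sum of pixels plus that class's bias (y = mx + b).
    scores = [
        sum(w * x for w, x in zip(class_weights, pixels)) + b
        for class_weights, b in zip(weights, biases)
    ]
    return softmax(scores)

# Toy example: a 4-pixel "image" and 3 classes with made-up weights.
pixels = [0.0, 0.5, 1.0, 0.25]
weights = [
    [0.2, -0.1, 0.4, 0.0],
    [-0.3, 0.8, -0.2, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
biases = [0.0, 0.1, -0.1]

probs = predict(pixels, weights, biases)
predicted_class = probs.index(max(probs))  # index of the most probable class
```

The probabilities always sum to 1, and the class with the largest weighted sum gets the largest probability, so softmax changes the scale of the votes but not their ranking.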
Although I’ve used the terms logistic and softmax interchangeably, this is the primary difference between them: softmax accomplishes what logistic does, but across multiple classes.

[Figure: The softmax function, which converts linear inputs to probabilities.]

The weights are learned through an iterative process called gradient descent and back-propagation, whereby error is attributed to specific weights with each prediction, and each weight is nudged up or down before the model tries again. In this case, we use an error metric called cross-entropy, which averages, over the training examples, the product of the true category and the negative log of the predicted probability for that category.

[Figure: Cross-entropy loss, which collects the incorrect predictions into a single metric.]

It’s a simple model, so I’ll let the notebook do the rest of the talking here:

[Embedded notebook]

Bottom line, we can do better. Best-case accuracy, even after longer training periods, was about 57%. Here’s the TensorBoard graph of accuracy over 5000 epochs:

[Figure: Accuracy over training epochs. Blue: training accuracy. Purple: validation accuracy.]

The model’s accuracy, even on training data, is well below human accuracy, indicating that the model is likely not complex enough to extract meaningful information from the huge number of features it is being given. The big split between training and validation accuracy also indicates that the model is overfitting to the information that it is able to extract.

So what if we expand this single neuron classifier into a deep neural net? That will come in my next post:

Next Up: Training an Architectural Classifier: Deep Neural Networks
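For readers who want the gradient descent and cross-entropy description above in concrete form, here is a rough sketch of a single-example training step in plain Python. It is not the notebook’s TensorFlow code: the toy image, the zero-initialized weights, the learning rate of 0.1, and the function names are all assumptions made for illustration. For a single-layer model like this one, back-propagation reduces to the one-line gradient noted in the comments.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Negative log of the probability assigned to the true class."""
    return -math.log(probs[true_class])

def train_step(pixels, true_class, weights, biases, learning_rate=0.1):
    """Apply one gradient descent update in place; return the loss before the update."""
    scores = [
        sum(w * x for w, x in zip(row, pixels)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    loss = cross_entropy(probs, true_class)
    for k in range(len(weights)):
        # For softmax with cross-entropy, d(loss)/d(score_k) = p_k - y_k,
        # where y_k is 1 for the true class and 0 otherwise.
        error = probs[k] - (1.0 if k == true_class else 0.0)
        for i, x in enumerate(pixels):
            weights[k][i] -= learning_rate * error * x
        biases[k] -= learning_rate * error
    return loss

# Toy setup (hypothetical): one 4-pixel image whose true class is 2, three classes.
pixels = [0.0, 0.5, 1.0, 0.25]
weights = [[0.0] * 4 for _ in range(3)]
biases = [0.0, 0.0, 0.0]

# Repeating the step on the same example drives the loss down.
losses = [train_step(pixels, 2, weights, biases) for _ in range(50)]
```

With all weights at zero the model assigns each of the three classes a probability of 1/3, so the first loss is log(3); each subsequent step shifts weight towards the true class. A real training run would average the loss over batches of many images rather than fitting one example.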

Training an Architectural Classifier — II
