Too Long; Didn't Read
The choice of optimization algorithms in deep learning can influence the network training speed and its performance. In this article, we will discuss the need for improving the gradient descent optimization technique. Gradient descent algorithm updates the parameters by moving in the direction opposite to the gradient of the objective function with respect to the network parameters. The gradient descent algorithm for single sigmoid neuron works like this,Initialize the parameters randomly w and b and iterate over all the observations in the data. Then update the value of each parameter based on its gradient value. Then continue doing step 2 and 3 till loss function gets minimized.