Now that we know how to calculate the gradient, we want to take a “step” in that direction. However, it’s important to think about whether that step is too big or too small. We don’t want to overshoot the minimum error!
We can scale the size of the step by multiplying the gradient by a learning rate.
To find a new b value, we would say:

new_b = current_b - (learning_rate * b_gradient)

Here, current_b is our guess for what the b value is, b_gradient is the gradient of the loss curve at our current guess, and learning_rate controls the size of the step we want to take.
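For example, here is a minimal sketch of that update, with made-up values for the current guess and the gradient:

learning_rate = 0.01   # a fairly small step size
current_b = 0.5        # hypothetical current guess for b
b_gradient = -2.0      # hypothetical gradient of the loss at current_b

# Step against the gradient, scaled by the learning rate
new_b = current_b - (learning_rate * b_gradient)
print(new_b)   # 0.52: b nudged in the direction that lowers loss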
In a few exercises, we’ll talk about the implications of a large or small learning rate, but for now, let’s use a fairly small value.
Define a function called step_gradient() that takes in x, y, b_current, and m_current. This function will find the gradients at b_current and m_current, and then return new b and m values that have been moved in that direction. For now, just return the pair (b_current, m_current).
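A sketch of this first version might look like the following (the exact parameter list is an assumption and should match your earlier gradient functions):

def step_gradient(x, y, b_current, m_current):
    # Placeholder for now: no movement yet, just echo the current guesses
    return (b_current, m_current)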
Inside step_gradient(), find the gradient at b_current and the gradient at m_current using the functions defined before. Store these gradients in variables called b_gradient and m_gradient, and return these from the function instead of b_current and m_current. Return them as a list.
Let’s try to move the parameter values in the direction of the gradient at a rate of 0.01. Create variables called b and m, and set them equal to:

b = b_current - (0.01 * b_gradient)
m = m_current - (0.01 * m_gradient)

Return the pair b and m from the function.
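Putting these steps together, a sketch of the finished function might look like this; get_gradient_at_b() and get_gradient_at_m() are assumed names standing in for whatever gradient functions you defined in the previous exercises:

def step_gradient(x, y, b_current, m_current):
    # Find the gradient of the loss at the current guesses
    # (assumed helper names from the earlier exercises)
    b_gradient = get_gradient_at_b(x, y, b_current, m_current)
    m_gradient = get_gradient_at_m(x, y, b_current, m_current)
    # Move each parameter against its gradient at a rate of 0.01
    b = b_current - (0.01 * b_gradient)
    m = m_current - (0.01 * m_gradient)
    return [b, m]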
We have provided Sandra’s lemonade data once more. We have a guess for what we think the b and m values might be. Call your function to perform one step of gradient descent. Store the results in the variables b and m.
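That call might look like this sketch; the months and revenue lists and the starting guesses below are placeholders, not the actual data provided in the exercise:

# Placeholder data and starting guesses: substitute the provided
# lemonade data and initial b and m values
months = [1, 2, 3, 4, 5]
revenue = [52, 74, 79, 95, 115]

b, m = step_gradient(months, revenue, 0, 0)
print(b, m)   # the parameters after one step of gradient descent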
Great! We now have a way to step to new b and m values! Next, we will call this function many times, in order to move those values toward lower and lower loss.
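As a preview, that repeated stepping might look something like this sketch (the 1000 iterations are an arbitrary choice, and months, revenue, and step_gradient are as in the sketches above):

b, m = 0, 0   # arbitrary starting guesses
for _ in range(1000):
    b, m = step_gradient(months, revenue, b, m)
# b and m have now taken 1000 small steps toward lower loss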