RMSProp
RMSProp is an adaptive learning rate optimization algorithm that maintains moving averages of the squared gradients to normalize parameter updates. This approach helps stabilize and accelerate training, especially in scenarios with noisy gradients or non-stationary objectives.
Syntax
torch.optim.RMSprop(
  params,
  lr=0.01,
  alpha=0.99,
  eps=1e-08,
  weight_decay=0,
  momentum=0,
  centered=False
)
- `params`: An iterable of parameters to optimize (such as `model.parameters()`).
- `lr`: The learning rate (default is `0.01`).
- `alpha`: Smoothing constant used in the moving average (default is `0.99`).
- `eps`: Term added to the denominator to improve numerical stability (default is `1e-08`).
- `weight_decay`: L2 penalty (default is `0`).
- `momentum`: Momentum factor (default is `0`).
- `centered`: If set to `True`, normalizes the gradient by an estimate of its variance, centering the moving average of squared gradients by the moving average of the gradients (default is `False`).
Example
The following code snippet demonstrates how RMSProp can be utilized to train a simple neural network with PyTorch:
import torch
import torch.nn as nn
import torch.optim as optim

# Sample model: a simple feed-forward network
model = nn.Sequential(
  nn.Linear(10, 5),
  nn.ReLU(),
  nn.Linear(5, 1)
)

# Loss function
criterion = nn.MSELoss()

# Optimizer: RMSProp
optimizer = optim.RMSprop(model.parameters(), lr=0.01, alpha=0.99, momentum=0.9)

# Dummy input and target
x = torch.randn(2, 10)  # batch size = 2, input features = 10
target = torch.randn(2, 1)

# Forward pass
output = model(x)
loss = criterion(output, target)

# Backward pass and parameter update
loss.backward()
optimizer.step()

print(f"Loss after one update: {loss.item():.4f}")
Since the input and target are random, the exact value varies between runs; the above code prints output similar to the following:
Loss after one update: 0.6504
Here is the step-by-step process used in the above example:
- Model Definition: Creates a simple feed-forward network with `Linear` layers and a `ReLU` activation.
- Criterion: Uses `nn.MSELoss` to compute the mean squared error between predictions and targets.
- Optimizer Configuration: Sets up RMSProp with a specified learning rate, alpha, and momentum.
- Forward Pass: Feeds input data into the model to obtain output predictions.
- Compute Loss: Calculates how close predictions are to the target.
- Backward Pass: Computes gradients using `loss.backward()`.
- Parameter Update: Adjusts model parameters using `optimizer.step()`.
Running this script prints a loss value indicating the training progress on a single batch. In practice, training iterates over many batches and epochs, as sketched below.