Adam
Adam (Adaptive Moment Estimation) is a popular optimization algorithm used to train neural networks in PyTorch. It combines the benefits of the AdaGrad and RMSProp algorithms, making it effective for handling sparse gradients and non-stationary objectives. Adam maintains an adaptive learning rate for each parameter and incorporates momentum to accelerate convergence.
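Conceptually, Adam keeps exponentially decaying averages of past gradients (the first moment) and past squared gradients (the second moment), then uses bias-corrected versions of both to scale each parameter's step. The following minimal sketch of a single-parameter update is purely illustrative; the helper adam_step and its variable names (m, v, t) are not part of the PyTorch API.

import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Update biased first- and second-moment estimates
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction counteracts the zero initialization of m and v
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Adaptive, per-parameter step size
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Usage sketch: start with m = v = 0 and t = 1, incrementing t each step
param, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    grad = 2 * param  # hypothetical gradient of param**2
    param, m, v = adam_step(param, grad, m, v, t)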
Syntax
In PyTorch, the torch.optim.Adam class is used to implement the Adam optimizer:
torch.optim.Adam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-8, weight_decay=0, amsgrad=False, ...)
- params: The iterable of parameters to optimize (typically model.parameters()).
- lr (Optional): The learning rate (default is 0.001).
- betas (Optional): The coefficients used for computing running averages of the gradient and the squared gradient (default is (0.9, 0.999)).
- eps (Optional): The value added to the denominator to prevent division by zero (default is 1e-8).
- weight_decay (Optional): The L2 regularization factor (default is 0).
- amsgrad (Optional): If True, uses the AMSGrad variant of the Adam optimizer (default is False).
Note: The ellipsis (...) indicates that there can be additional optional parameters beyond those listed here, depending on specific use cases.
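For example, the optimizer can be created with non-default settings. The model (nn.Linear(10, 2)) and the hyperparameter values below are illustrative only:

import torch.nn as nn
import torch.optim as optim

# Hypothetical model used only to illustrate the constructor arguments
model = nn.Linear(10, 2)

# Adam with explicit (non-default) hyperparameters
optimizer = optim.Adam(
    model.parameters(),
    lr=0.0005,            # smaller learning rate
    betas=(0.9, 0.99),    # running-average coefficients
    eps=1e-8,             # numerical stability term
    weight_decay=1e-4,    # L2 regularization
    amsgrad=True          # use the AMSGrad variant
)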
Example
The following example demonstrates the usage of the Adam optimizer to train a neural network in PyTorch:
import torch
import torch.nn as nn
import torch.optim as optim

# Define the model
model = nn.Sequential(
    nn.Linear(1, 1)  # Single linear layer
)

# Define the loss function (MSELoss) and Adam optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Sample training loop
for epoch in range(50):
    # Sample input and target
    x = torch.randn(5, 1)  # 5 samples, 1 feature each
    y = 2*x + 1  # Linear relationship

    # Forward pass
    output = model(x)
    loss = criterion(output, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print loss every 10 epochs
    if epoch % 10 == 0:
        print(f'Epoch [{epoch}/50], Loss: {loss.item():.4f}')
A sample output from the training loop might look like this (actual values may vary due to randomness):
Epoch [0/50], Loss: 0.4287
Epoch [10/50], Loss: 9.2974
Epoch [20/50], Loss: 3.8182
Epoch [30/50], Loss: 1.9353
Epoch [40/50], Loss: 14.0286
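After training, the optimizer's hyperparameters can be inspected or adjusted through its param_groups attribute. The snippet below assumes the optimizer defined in the example above:

# Inspect the current learning rate
print(optimizer.param_groups[0]['lr'])  # 0.01

# Manually lower the learning rate before further training
for group in optimizer.param_groups:
    group['lr'] = 0.001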