Loss Functions
Loss functions, also known as cost functions, quantify the error between a neural network's predictions and the target values in PyTorch. They compute a scalar loss value, which optimization algorithms use to adjust the model's weights during training, improving its performance over time.
PyTorch's `nn` (neural network) module provides a variety of built-in loss functions designed for different tasks, such as regression and classification.
Syntax
The general syntax for using a loss function in PyTorch is:
import torch.nn as nn
# Define the loss function.
criterion = nn.LossFunctionName(*args, **kwargs)
# Compute the loss.
loss = criterion(predicted_outputs, target_values)
- Replace `LossFunctionName` with a specific function like `MSELoss` for regression or `CrossEntropyLoss` for classification.
- Ensure that the shapes and types of `predicted_outputs` and `target_values` meet the loss function's requirements. For instance:
  - For `nn.CrossEntropyLoss`, `predicted_outputs` should contain raw scores (logits), and `target_values` should hold class indices.
  - For `nn.BCEWithLogitsLoss`, the function expects raw scores, applying the Sigmoid function internally.
- Most loss functions include a `reduction` parameter (`mean`, `sum`, or `none`) that specifies how to aggregate the loss, as shown in the sketch below.
Together, these elements produce the scalar loss value that the optimizer uses to adjust the model's weights and improve performance over iterations.
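To see how `reduction` changes the result, the minimal sketch below (using small hand-picked tensors, not from the examples that follow) computes the same MSE loss with each of the three settings:

import torch
import torch.nn as nn

predicted = torch.tensor([2.5, 0.0, 2.0])
target = torch.tensor([3.0, -0.5, 2.0])

# 'none' returns the per-element losses
print(nn.MSELoss(reduction='none')(predicted, target))  # tensor([0.2500, 0.2500, 0.0000])

# 'mean' (the default) averages them into a single scalar
print(nn.MSELoss(reduction='mean')(predicted, target))  # tensor(0.1667)

# 'sum' adds them into a single scalar
print(nn.MSELoss(reduction='sum')(predicted, target))   # tensor(0.5000)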
Example 1
Mean Squared Error Loss (`nn.MSELoss`) for Regression:
import torch
import torch.nn as nn
import torch.optim as optim

# Hyperparameters
input_size = 10
output_size = 1
learning_rate = 0.01
batch_size = 32

# Sample data
inputs = torch.randn(batch_size, input_size)
targets = torch.randn(batch_size, output_size)

# Define a simple linear regression model
model = nn.Linear(input_size, output_size)

# Define the loss function
criterion = nn.MSELoss()

# Define the optimizer
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

# Forward pass
outputs = model(inputs)

# Compute loss
loss = criterion(outputs, targets)
print('Initial Loss:', loss.item())

# Backward pass and optimization
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Forward pass after optimization
outputs = model(inputs)
loss = criterion(outputs, targets)
print('Loss after one optimization step:', loss.item())
The output for this example will look similar to the following:

Initial Loss: 1.0875437259674072
Loss after one optimization step: 1.0775121450424194
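As a sanity check, `nn.MSELoss` with the default `mean` reduction matches averaging the squared differences by hand. The short sketch below (using freshly generated random tensors, so the values are arbitrary) confirms the equivalence:

import torch
import torch.nn as nn

outputs = torch.randn(32, 1)
targets = torch.randn(32, 1)

builtin = nn.MSELoss()(outputs, targets)
manual = ((outputs - targets) ** 2).mean()  # mean of squared errors
print(torch.allclose(builtin, manual))  # True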
Example 2
Cross Entropy Loss (`nn.CrossEntropyLoss`) for Classification:
import torch
import torch.nn as nn
import torch.optim as optim

# Hyperparameters
input_size = 784  # e.g., 28x28 images flattened
hidden_size = 128
num_classes = 10
learning_rate = 0.001
batch_size = 64

# Sample data
inputs = torch.randn(batch_size, input_size)
targets = torch.randint(0, num_classes, (batch_size,))  # Class labels

# Define a simple neural network
model = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, num_classes)
)

# Define the loss function
criterion = nn.CrossEntropyLoss()

# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Forward pass
outputs = model(inputs)

# Compute loss
loss = criterion(outputs, targets)
print('Initial Loss:', loss.item())

# Backward pass and optimization
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Forward pass after optimization
outputs = model(inputs)
loss = criterion(outputs, targets)
print('Loss after one optimization step:', loss.item())
The output for this example will look similar to the following:

Initial Loss: 2.3014121055603027
Loss after one optimization step: 2.294567823410034
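`nn.CrossEntropyLoss` combines `nn.LogSoftmax` and `nn.NLLLoss` in a single step, which is why the model above outputs raw logits rather than probabilities. The sketch below (with random logits and labels) verifies the equivalence:

import torch
import torch.nn as nn

logits = torch.randn(4, 10)          # raw scores for 4 samples, 10 classes
labels = torch.randint(0, 10, (4,))  # target class indices

ce = nn.CrossEntropyLoss()(logits, labels)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), labels)
print(torch.allclose(ce, nll))  # True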
Example 3
Binary Cross Entropy Loss (`nn.BCEWithLogitsLoss`) for Binary Classification:
import torch
import torch.nn as nn
import torch.optim as optim

# Hyperparameters
input_size = 20
hidden_size = 16
learning_rate = 0.005
batch_size = 16

# Sample data
inputs = torch.randn(batch_size, input_size)
targets = torch.randint(0, 2, (batch_size, 1)).float()  # Binary targets (0 or 1)

# Define a simple neural network
model = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, 1)
)

# Define the loss function
criterion = nn.BCEWithLogitsLoss()

# Define the optimizer
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

# Forward pass
outputs = model(inputs)

# Compute loss
loss = criterion(outputs, targets)
print('Initial Loss:', loss.item())

# Backward pass and optimization
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Forward pass after optimization
outputs = model(inputs)
loss = criterion(outputs, targets)
print('Loss after one optimization step:', loss.item())
The output for this example will look similar to the following:

Initial Loss: 0.6937122344970703
Loss after one optimization step: 0.6932134628295898
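Because `nn.BCEWithLogitsLoss` applies the Sigmoid internally, it is numerically more stable than chaining `torch.sigmoid` with `nn.BCELoss` yourself. The sketch below (with random logits and binary targets) shows that the two produce the same value:

import torch
import torch.nn as nn

logits = torch.randn(8, 1)
targets = torch.randint(0, 2, (8, 1)).float()

combined = nn.BCEWithLogitsLoss()(logits, targets)
manual = nn.BCELoss()(torch.sigmoid(logits), targets)
print(torch.allclose(combined, manual))  # True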
Note: When running these examples, the exact numerical values of the losses will vary each time due to random initialization of model weights and input data. The important aspect is observing the trend of the loss decreasing after the optimization step, indicating that the model is learning.
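If you need reproducible numbers across runs, one common approach is to seed PyTorch's random number generator before creating the data and model:

import torch

torch.manual_seed(42)  # any fixed integer makes subsequent randn/randint calls repeatable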