.addcdiv()
The `.addcdiv()` function performs an element-wise division of two tensors, scales the quotient by a scalar, and adds the result to another tensor. It returns a new tensor computed as `input + value * (tensor1 / tensor2)`. This operation is commonly used in optimization algorithms, particularly in implementations of adaptive learning-rate methods such as Adam, RMSprop, and Adagrad.
In deep learning frameworks, efficient tensor operations are crucial for performance. The `.addcdiv()` function provides a convenient way to perform multiple arithmetic operations in a single call, which can reduce computational overhead and improve code readability in complex neural network implementations.
Syntax
```py
torch.addcdiv(input, tensor1, tensor2, *, value=1, out=None)
```
Parameters:
- `input`: The base tensor to which the scaled division result is added.
- `tensor1`: The numerator tensor in the division operation.
- `tensor2`: The denominator tensor in the division operation.
- `value` (Optional): A scalar multiplier for the division result (default: `1`).
- `out` (Optional): The output tensor to store the result.
Return value:
Returns a new tensor containing the result of the operation, unless `out` is specified.
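As a quick sketch of the two keyword arguments, the snippet below shows how `value` scales the quotient and how `out` receives the result; the tensor values here are arbitrary illustrations:

```py
import torch

a = torch.tensor([1.0, 2.0])
b = torch.tensor([4.0, 9.0])
c = torch.tensor([2.0, 3.0])

# value scales the quotient before it is added: a + 0.5 * (b / c)
print(torch.addcdiv(a, b, c, value=0.5))
# Output: tensor([2.0000, 3.5000])

# out writes the result into a preallocated tensor instead of allocating a new one
result = torch.empty(2)
torch.addcdiv(a, b, c, value=0.5, out=result)
print(result)
# Output: tensor([2.0000, 3.5000])
```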
Example 1: Basic Usage of `.addcdiv()` with Tensors

Let's start with a basic example to understand how `.addcdiv()` works with PyTorch tensors:
```py
import torch

# Create input tensors
input_tensor = torch.tensor([1.0, 2.0, 3.0, 4.0])
tensor1 = torch.tensor([2.0, 4.0, 6.0, 8.0])
tensor2 = torch.tensor([1.0, 2.0, 2.0, 4.0])

# Apply addcdiv with default value=1
result = torch.addcdiv(input_tensor, tensor1, tensor2)

# Print results
print("Input tensor:", input_tensor)
print("Tensor1 (numerator):", tensor1)
print("Tensor2 (denominator):", tensor2)
print("Result:", result)
```
This example results in the following output:
```shell
Input tensor: tensor([1., 2., 3., 4.])
Tensor1 (numerator): tensor([2., 4., 6., 8.])
Tensor2 (denominator): tensor([1., 2., 2., 4.])
Result: tensor([3., 4., 6., 6.])
```
The `.addcdiv()` operation computes `input + tensor1 / tensor2`. For each element, it divides the corresponding value in `tensor1` by the value in `tensor2`, then adds the result to `input_tensor`.
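To make the fused semantics concrete, the call below is checked against the equivalent explicit expression; the `value=2.0` multiplier is an arbitrary choice for illustration:

```py
import torch

input_tensor = torch.tensor([1.0, 2.0, 3.0, 4.0])
tensor1 = torch.tensor([2.0, 4.0, 6.0, 8.0])
tensor2 = torch.tensor([1.0, 2.0, 2.0, 4.0])

# The fused call matches the explicit element-wise expression
fused = torch.addcdiv(input_tensor, tensor1, tensor2, value=2.0)
explicit = input_tensor + 2.0 * tensor1 / tensor2

print(torch.allclose(fused, explicit))
# Output: True
```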
Example 2: Optimizing Neural Networks with `.addcdiv()`

The `.addcdiv()` function is particularly useful in optimization algorithms for neural networks. Here's an example that implements a simplified version of the RMSprop optimizer using this operation:
```py
import torch
import torch.nn as nn

# Create a simple neural network
model = nn.Sequential(
    nn.Linear(10, 5),
    nn.ReLU(),
    nn.Linear(5, 2)
)

# Initialize hyperparameters and optimizer state
learning_rate = 0.01
decay_rate = 0.99
epsilon = 1e-8
gradient_squared = [torch.zeros_like(param) for param in model.parameters()]

# Sample input for demonstration
x = torch.randn(1, 10)
target = torch.tensor([1])

# Forward pass
output = model(x)
loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(output, target)

# Backward pass
loss.backward()

# RMSprop update using addcdiv
for i, param in enumerate(model.parameters()):
    if param.grad is not None:
        # Update the running average of squared gradients
        gradient_squared[i] = decay_rate * gradient_squared[i] + (1 - decay_rate) * param.grad.pow(2)

        # Update parameters using addcdiv
        # Formula: param = param - learning_rate * grad / (sqrt(gradient_squared) + epsilon)
        param.data = torch.addcdiv(param.data, param.grad, gradient_squared[i].sqrt() + epsilon, value=-learning_rate)

print("Parameters updated using addcdiv in RMSprop optimizer")
```
This example shows how `.addcdiv()` can be used to implement the RMSprop optimization algorithm, which divides each gradient by the square root of a running average of squared gradients to adapt the learning rate for each parameter.
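The in-place variant, `Tensor.addcdiv_()`, is handy for optimizer-style updates because it avoids allocating a new tensor for each parameter. Below is a minimal, Adam-style sketch (bias correction omitted for brevity; the hyperparameters are illustrative placeholders, not PyTorch's actual optimizer code):

```py
import torch

# Illustrative hyperparameters (not tuned for any real model)
lr, beta1, beta2, eps = 1e-3, 0.9, 0.999, 1e-8

param = torch.randn(5)
grad = torch.randn(5)                  # stand-in for a computed gradient
exp_avg = torch.zeros_like(param)      # first-moment estimate
exp_avg_sq = torch.zeros_like(param)   # second-moment estimate

# One Adam-style step
exp_avg = beta1 * exp_avg + (1 - beta1) * grad
exp_avg_sq = beta2 * exp_avg_sq + (1 - beta2) * grad.pow(2)
denom = exp_avg_sq.sqrt() + eps

# In-place equivalent of: param = param - lr * exp_avg / denom
param.addcdiv_(exp_avg, denom, value=-lr)
```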
Example 3: Custom Normalization with `.addcdiv()`

`.addcdiv()` can also be used for custom normalization techniques. Here's an example that implements a form of layer normalization using this operation:
```py
import torch

# Create a batch of feature vectors (batch_size=3, feature_dim=4)
features = torch.tensor([[2.0, 4.0, 1.0, 3.0],
                         [6.0, 2.0, 5.0, 1.0],
                         [3.0, 5.0, 2.0, 4.0]])

# Calculate mean and standard deviation along the feature dimension
mean = features.mean(dim=1, keepdim=True)       # Shape: (3, 1)
std = features.std(dim=1, keepdim=True) + 1e-5  # Shape: (3, 1)

# Create scale and shift parameters (like in BatchNorm/LayerNorm)
scale = torch.tensor([1.0, 0.5, 2.0, 1.5]).unsqueeze(0)  # Shape: (1, 4)
shift = torch.tensor([0.0, 0.0, 0.0, 0.0]).unsqueeze(0)  # Shape: (1, 4)

# Normalize using addcdiv, then apply the scale separately
# Formula: normalized = (shift + (features - mean) / std) * scale
normalized = torch.addcdiv(shift, features - mean, std, value=1.0) * scale

print("Original features:")
print(features)
print("\nNormalized features:")
print(normalized)
```
This example results in the following output:
```shell
Original features:
tensor([[2., 4., 1., 3.],
        [6., 2., 5., 1.],
        [3., 5., 2., 4.]])

Normalized features:
tensor([[-0.3873,  0.5809, -2.3238,  0.5809],
        [ 1.0502, -0.3151,  1.2602, -1.5753],
        [-0.3873,  0.5809, -2.3238,  0.5809]])
```
This example demonstrates how `.addcdiv()` can be used to implement custom normalization techniques, making the code more concise and efficient than chaining separate tensor operations.
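To verify that the fused call behaves the same as the chain of separate operations, the sketch below compares the two (reusing the tensors from the example above):

```py
import torch

features = torch.tensor([[2.0, 4.0, 1.0, 3.0],
                         [6.0, 2.0, 5.0, 1.0],
                         [3.0, 5.0, 2.0, 4.0]])

mean = features.mean(dim=1, keepdim=True)
std = features.std(dim=1, keepdim=True) + 1e-5
shift = torch.zeros(1, 4)
scale = torch.tensor([[1.0, 0.5, 2.0, 1.5]])

# Fused version vs. the same computation written as separate operations
fused = torch.addcdiv(shift, features - mean, std) * scale
separate = (shift + (features - mean) / std) * scale

print(torch.allclose(fused, separate))
# Output: True
```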
To learn more about how to create, train, and test neural networks, check out this Intro to PyTorch and Neural Networks course on Codecademy.