Stochastic Gradient Descent
In Sklearn, Stochastic Gradient Descent (SGD) is a popular optimization algorithm that finds the set of model parameters that minimizes a given loss function.
Unlike traditional gradient descent, which calculates the gradient over the entire dataset, SGD computes the gradient using a single training example at a time. This makes it computationally efficient for large datasets.
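To make the per-example update concrete, here is a minimal sketch of SGD for linear regression with squared loss, written from scratch rather than with Sklearn. The learning rate eta, the epoch count, and the toy data are illustrative assumptions:
import numpy as np
# Toy data: 4 examples, 2 features (illustrative values)
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([5.0, 4.0, 11.0, 10.0])
w = np.zeros(X.shape[1])  # weights
b = 0.0                   # bias
eta = 0.01                # learning rate (assumed value)
for epoch in range(100):
    for xi, yi in zip(X, y):
        error = (w @ xi + b) - yi  # prediction error on ONE example
        w -= eta * error * xi      # gradient step for squared loss
        b -= eta * error
print(w, b)
Each inner-loop iteration adjusts the weights using only one example, which is the defining difference from batch gradient descent.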
Sklearn provides two primary classes for implementing SGD:
- SGDClassifier: Well-suited for classification tasks. Supports various loss functions and penalties for fitting linear classification models.
- SGDRegressor: Well-suited for regression tasks. Supports various loss functions and penalties for fitting linear regression models.
Syntax
Following is the syntax for implementing SGD using SGDClassifier:
from sklearn.linear_model import SGDClassifier
# Create an SGDClassifier model
model = SGDClassifier(loss="hinge", penalty="l2", max_iter=1000, random_state=42)
# Fit the classifier to the training data
model.fit(X_train, y_train)
# Make predictions on the new data
y_pred = model.predict(X_test)
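After fitting, SGDClassifier also exposes decision_function, which returns confidence scores; predict_proba is available only for probabilistic losses such as log_loss or modified_huber. A minimal sketch, assuming the model fit above:
# Signed distances to the separating hyperplane (one column per class for multiclass)
scores = model.decision_function(X_test)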
Following is the syntax for implementing SGD using SGDRegressor:
from sklearn.linear_model import SGDRegressor
# Create an SGDRegressor model
model = SGDRegressor(loss="squared_error", penalty="l2", max_iter=1000, random_state=42)
# Fit the regressor to the training data
model.fit(X_train, y_train)
# Make predictions on the new data
y_pred = model.predict(X_test)
- loss: Specifies the loss function.
  - For SGDClassifier, the options include hinge (default), log_loss, and modified_huber.
  - For SGDRegressor, the options include squared_error (default), huber, and epsilon_insensitive.
- penalty: Specifies the regularization penalty. Common options include l2 (L2 regularization, default), l1 (L1 regularization), and elasticnet (a combination of L1 and L2 regularization).
- max_iter: Specifies the maximum number of passes over the training data. The default value is 1000. Excessive values can lead to overfitting or unnecessary computations.
- random_state: Specifies the random seed for reproducibility. The default value is None. Setting random_state ensures consistent results across runs by fixing the randomness of data shuffling and model initialization.
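A configuration combining several of these parameters might look like the following; a minimal sketch, assuming an elasticnet penalty with an illustrative l1_ratio of 0.5 (l1_ratio is an additional SGDClassifier parameter not covered above):
from sklearn.linear_model import SGDClassifier
# Elastic net mixes L1 and L2 penalties; l1_ratio controls the mix
# (0 = pure L2, 1 = pure L1). The value 0.5 here is illustrative.
model = SGDClassifier(
    loss="log_loss",       # logistic-regression-style loss
    penalty="elasticnet",
    l1_ratio=0.5,
    max_iter=1000,
    random_state=42
)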
Example
The following example demonstrates the implementation of SGD using SGDClassifier:
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Create training and testing sets by splitting the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create an SGDClassifier model
model = SGDClassifier(loss="hinge", penalty="l2", max_iter=1000, random_state=42)
# Fit the classifier to the training data
model.fit(X_train, y_train)
# Make predictions on the new data
y_pred = model.predict(X_test)
# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
The above code produces the following output:
Accuracy: 0.8
Codebyte Example
The following codebyte example demonstrates the implementation of SGD using SGDRegressor:
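This sketch assumes the Diabetes dataset and a StandardScaler preprocessing step; both are illustrative choices rather than requirements:
from sklearn.datasets import load_diabetes
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
# Load the Diabetes dataset (assumed stand-in dataset)
diabetes = load_diabetes()
X = diabetes.data
y = diabetes.target
# Create training and testing sets by splitting the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Scale features; SGD is sensitive to feature scaling
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Create an SGDRegressor model
model = SGDRegressor(loss="squared_error", penalty="l2", max_iter=1000, random_state=42)
# Fit the regressor to the training data
model.fit(X_train, y_train)
# Make predictions on the new data
y_pred = model.predict(X_test)
# Evaluate the model with mean squared error
mse = mean_squared_error(y_test, y_pred)
print("Mean Squared Error:", mse)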