Gaussian Processes
Gaussian Processes are a supervised learning framework that predicts outcomes as distributions rather than single values. They assume that the function values at any finite set of input points follow a joint Gaussian distribution.
They are useful for modeling complex, nonlinear relationships and for quantifying the uncertainty of each prediction.
Gaussian Processes are implemented in Scikit-learn via GaussianProcessRegressor and GaussianProcessClassifier.
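To make the joint-Gaussian assumption concrete, the following minimal sketch draws a few random functions from a Gaussian Process prior using NumPy directly. The RBF kernel, the zero mean, the grid of 50 points, the seed, and the small jitter term are all illustrative assumptions rather than requirements of the library.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF

# Illustrative inputs: 50 evenly spaced points on [0, 10]
X = np.linspace(0, 10, 50).reshape(-1, 1)

# Covariance between every pair of inputs under an RBF kernel
K = RBF(length_scale=1.0)(X)

# A small jitter on the diagonal helps keep the matrix numerically positive definite
K += 1e-8 * np.eye(len(X))

# Function values at these inputs are jointly Gaussian: zero mean, covariance K
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean=np.zeros(len(X)), cov=K, size=3)

print(samples.shape)  # one row per sampled function
```

Each row of samples is one plausible function evaluated on the grid. Fitting a Gaussian Process to data narrows this family down to the functions that agree with the observations.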
Syntax
Scikit-learn provides the sklearn.gaussian_process module, which contains the essential Gaussian Process estimators.
From this module, you can use the GaussianProcessRegressor class for regression tasks and the GaussianProcessClassifier class for classification tasks.
Here is the basic syntax for using them:
GaussianProcessRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
# Define the Gaussian Process Regressor
gp_regressor = GaussianProcessRegressor(kernel=RBF(), alpha=1e-10)
- kernel: Defines the covariance function of the Gaussian Process. For example, RBF() is a radial basis function kernel, which measures similarity between points based on the distance between them.
- alpha: A value added to the diagonal of the kernel matrix during fitting. It improves numerical stability and can also be interpreted as the variance of Gaussian noise on the training targets.
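As a rough illustration of both parameters, the sketch below evaluates an RBF kernel on three points and then fits two regressors that differ only in alpha. The data points and the alpha value of 0.5 are arbitrary assumptions, and optimizer=None is passed so that the kernel's length scale stays fixed and the comparison isolates the effect of alpha.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Three 1-D inputs: the first two are close together, the third is far away
X = np.array([[0.0], [0.5], [5.0]])
y = np.sin(X).ravel()

# Calling a kernel object on X returns the pairwise covariance matrix
kernel = RBF(length_scale=1.0)
K = kernel(X)
print(np.round(K, 3))  # nearby points get values near 1, distant points near 0

# optimizer=None keeps the kernel fixed, so only alpha differs between the models
gp_exact = GaussianProcessRegressor(kernel=kernel, alpha=1e-10, optimizer=None).fit(X, y)
gp_noisy = GaussianProcessRegressor(kernel=kernel, alpha=0.5, optimizer=None).fit(X, y)

_, std_exact = gp_exact.predict(X, return_std=True)
_, std_noisy = gp_noisy.predict(X, return_std=True)

print("std at training points, alpha=1e-10:", std_exact)
print("std at training points, alpha=0.5:  ", std_noisy)
```

With a tiny alpha the model treats the training targets as almost exact, so its uncertainty at those points is close to zero. A larger alpha tells the model the targets are noisy, and the uncertainty at the training points stays clearly above zero.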
GaussianProcessClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
# Define the Gaussian Process Classifier
gp_classifier = GaussianProcessClassifier(kernel=None, n_restarts_optimizer=10)
- kernel: Defines the covariance function of the Gaussian Process. If set to None, the classifier uses the default kernel 1.0 * RBF(1.0).
- n_restarts_optimizer: Specifies how many times the optimizer is restarted from new initial values when searching for the kernel hyperparameters that maximize the log-marginal likelihood. Increasing this value can help avoid poor local optima at the cost of computation time.
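A minimal classification sketch might look like the following. The toy dataset, the choice of two optimizer restarts, and random_state=0 are assumptions made to keep the run short and repeatable; the kernel is written out explicitly only to show what the default corresponds to.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Toy 1-D data: the label is 1 for positive inputs and 0 otherwise
X = np.linspace(-3, 3, 20).reshape(-1, 1)
y = (X.ravel() > 0).astype(int)

# Spell out the default kernel, 1.0 * RBF(1.0), instead of passing None
clf = GaussianProcessClassifier(
    kernel=1.0 * RBF(1.0),
    n_restarts_optimizer=2,
    random_state=0,
)
clf.fit(X, y)

# Query one point deep inside each class region
X_new = np.array([[-2.5], [2.5]])
labels = clf.predict(X_new)
proba = clf.predict_proba(X_new)

print("Predicted labels:", labels)
print("Class probabilities:", proba)
print("Optimized kernel:", clf.kernel_)
```

The predict_proba() method returns one column per class, so each row sums to 1. The kernel_ attribute holds the kernel with the hyperparameters found during fitting.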
Example
This example uses GaussianProcessRegressor to model a noisy sine function:
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
# Generate sample data
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * np.random.normal(size=100)  # Add noise
# Define the kernel
kernel = RBF(length_scale=1.0)
# Train the Gaussian Process Regressor
gp = GaussianProcessRegressor(kernel=kernel, alpha=0.1)
gp.fit(X, y)
# Make predictions and request the standard deviation of each one
X_test = np.linspace(0, 10, 100).reshape(-1, 1)
y_pred, sigma = gp.predict(X_test, return_std=True)
print("First 5 Predictions:", y_pred[:5])
print("First 5 Uncertainties (std_dev):", sigma[:5])
This example produces output similar to the following. The exact values vary between runs because the added noise is random and unseeded:
First 5 Predictions: [ 0.053 0.087 0.119 0.148 0.175]
First 5 Uncertainties (std_dev): [0.1 0.1 0.1 0.1 0.1]
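The predicted mean and standard deviation can be combined into an approximate uncertainty band. The sketch below repeats the setup above with a fixed seed and builds a 95% band as the mean plus or minus 1.96 standard deviations. The seed, the 50-point test grid, and the 1.96 multiplier are assumptions for illustration, and the band describes the model's belief about the underlying function under its Gaussian assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Fixed seed (an assumption) so repeated runs give the same data
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.normal(size=100)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=0.1)
gp.fit(X, y)

# Predict on a coarser grid and request the standard deviation
X_test = np.linspace(0, 10, 50).reshape(-1, 1)
y_pred, sigma = gp.predict(X_test, return_std=True)

# Approximate 95% band: mean +/- 1.96 standard deviations
lower = y_pred - 1.96 * sigma
upper = y_pred + 1.96 * sigma

# Compare the predicted mean against the noise-free sine curve
mean_abs_error = np.mean(np.abs(y_pred - np.sin(X_test).ravel()))

print("Fitted kernel:", gp.kernel_)
print("Mean absolute error vs. sin(x):", mean_abs_error)
print("Average band width:", np.mean(upper - lower))
```

The kernel_ attribute shows the length scale chosen during fitting, which may differ from the initial value of 1.0 because the regressor optimizes kernel hyperparameters by default.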