Scikit-Learn Cheatsheet
Scikit-learn is a Python library that provides many unsupervised and supervised learning algorithms. It’s built on NumPy and SciPy, and it works well alongside tools you might already be familiar with, like pandas and Matplotlib!
As you build robust Machine Learning programs, it’s helpful to have all the sklearn commands in one place in case you forget.
Linear regression in Scikit-Learn
Import and create the model:
from sklearn.linear_model import LinearRegression
your_model = LinearRegression()
Fit:
your_model.fit(x_training_data, y_training_data)
.coef_: contains the coefficients
.intercept_: contains the intercept
Predict:
predictions = your_model.predict(your_x_data)
.score(): returns the coefficient of determination R²
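The steps above can be put together in a minimal sketch. The toy data here is hypothetical (it is not from the original cheatsheet) and follows y = 2x + 1 exactly, so the fitted values are easy to check by eye.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data following y = 2x + 1 exactly.
# Features must be 2-D: shape (n_samples, n_features).
x_training_data = np.array([[0.0], [1.0], [2.0], [3.0]])
y_training_data = np.array([1.0, 3.0, 5.0, 7.0])

your_model = LinearRegression()
your_model.fit(x_training_data, y_training_data)

slope = your_model.coef_[0]          # one coefficient per feature
intercept = your_model.intercept_    # the fitted intercept

# Predict for a new, unseen x value
predictions = your_model.predict(np.array([[4.0]]))

# .score() takes features and true targets, and returns R²
r_squared = your_model.score(x_training_data, y_training_data)

print(slope, intercept, predictions, r_squared)
```

Because the data is perfectly linear, R² on the training data should come out at (or extremely close to) 1. Real data will usually score lower.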
Naive Bayes classification in Scikit-Learn
Import and create the model:
from sklearn.naive_bayes import MultinomialNB
your_model = MultinomialNB()
Fit:
your_model.fit(x_training_data, y_training_data)
Predict:
# Returns a list of predicted classes - one prediction for every data point
predictions = your_model.predict(your_x_data)
# For every data point, returns a list of probabilities of each class
probabilities = your_model.predict_proba(your_x_data)
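As a rough illustration of the calls above, here is a small sketch with made-up count data. MultinomialNB expects non-negative features such as word counts; the two columns below stand in for the counts of two hypothetical words, and the labels 0 and 1 are arbitrary.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical word-count features: each row is a document,
# each column is the count of one (made-up) word.
x_training_data = np.array([[3, 0], [4, 1], [0, 3], [1, 4]])
y_training_data = np.array([0, 0, 1, 1])

your_model = MultinomialNB()
your_model.fit(x_training_data, y_training_data)

your_x_data = np.array([[5, 0], [0, 5]])

# One predicted class per row of your_x_data
predictions = your_model.predict(your_x_data)

# One row per data point, one column per class; each row sums to 1
probabilities = your_model.predict_proba(your_x_data)

print(predictions)
print(probabilities)
```

The columns of `probabilities` follow the order of `your_model.classes_`, which is worth checking when labels are strings rather than integers.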
K-nearest neighbors (KNN) in Scikit-Learn
Import and create the model:
from sklearn.neighbors import KNeighborsClassifier
your_model = KNeighborsClassifier()
Fit:
your_model.fit(x_training_data, y_training_data)
Predict:
# Returns a list of predicted classes - one prediction for every data point
predictions = your_model.predict(your_x_data)
# For every data point, returns a list of probabilities of each class
probabilities = your_model.predict_proba(your_x_data)
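A minimal sketch of the same workflow on hypothetical one-dimensional data follows. It sets `n_neighbors=3`, a parameter the cheatsheet does not mention (the library default is 5), so that the vote for each point is easy to follow by hand.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: two well-separated groups on a line
x_training_data = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
y_training_data = np.array([0, 0, 0, 1, 1, 1])

# n_neighbors=3 is an illustrative choice, not taken from the text above
your_model = KNeighborsClassifier(n_neighbors=3)
your_model.fit(x_training_data, y_training_data)

your_x_data = np.array([[0.05], [1.15]])

# Each prediction is the majority class among the 3 nearest training points
predictions = your_model.predict(your_x_data)

# Each probability is the fraction of those 3 neighbors in each class
probabilities = your_model.predict_proba(your_x_data)

print(predictions)
print(probabilities)
```

With the default uniform weighting, `predict_proba` can only return multiples of 1/k, so a small k gives coarse probabilities.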
K-means clustering in Scikit-Learn
Import and create the model:
from sklearn.cluster import KMeans
your_model = KMeans(n_clusters=4, init='random')
n_clusters: number of clusters to form and number of centroids to generate
init: method for initialization
    k-means++: K-Means++ [default]
    random: standard K-Means, with initial centroids picked at random from the data
random_state: the seed used by the random number generator [optional]
Fit:
your_model.fit(x_training_data)
Predict:
predictions = your_model.predict(your_x_data)
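Here is a small sketch of the clustering workflow on hypothetical two-dimensional points that form two obvious groups. The `n_init` argument and the `labels_` and `cluster_centers_` attributes are not covered in the text above; they are included as assumptions about what is typically useful to inspect after fitting.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: three points near (0, 0) and three near (5, 5)
x_training_data = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 5.2], [5.2, 5.1],
])

# random_state makes the run repeatable; n_init=10 restarts the
# algorithm 10 times and keeps the best result
your_model = KMeans(n_clusters=2, init='k-means++', n_init=10, random_state=0)
your_model.fit(x_training_data)

labels = your_model.labels_             # cluster index of each training point
centers = your_model.cluster_centers_   # one centroid per cluster

# Assign new points to the nearest learned centroid
predictions = your_model.predict(np.array([[0.1, 0.1], [5.1, 5.1]]))

print(labels)
print(centers)
print(predictions)
```

Cluster indices are arbitrary: which group ends up as cluster 0 or cluster 1 can vary with the seed, so compare memberships rather than the raw numbers.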
Validating a machine learning model
Import and print accuracy, recall, precision, and F1 score:
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score
print(accuracy_score(true_labels, guesses))
print(recall_score(true_labels, guesses))
print(precision_score(true_labels, guesses))
print(f1_score(true_labels, guesses))
Import and print the confusion matrix:
from sklearn.metrics import confusion_matrix
print(confusion_matrix(true_labels, guesses))
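To make the metrics concrete, the sketch below uses a hypothetical set of eight binary labels and guesses, chosen so the counts are easy to verify by hand: 3 true positives, 3 true negatives, 1 false positive, and 1 false negative.

```python
from sklearn.metrics import (
    accuracy_score,
    recall_score,
    precision_score,
    f1_score,
    confusion_matrix,
)

# Hypothetical binary labels (1 = positive class) and model guesses
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
guesses     = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(true_labels, guesses)    # (TP + TN) / total = 6/8
recall = recall_score(true_labels, guesses)        # TP / (TP + FN) = 3/4
precision = precision_score(true_labels, guesses)  # TP / (TP + FP) = 3/4
f1 = f1_score(true_labels, guesses)                # harmonic mean of the two

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
matrix = confusion_matrix(true_labels, guesses)

print(accuracy, recall, precision, f1)
print(matrix)
```

All four scores equal 0.75 here only because the toy example has the same number of false positives and false negatives; on real data they usually differ.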
Splitting data into training and test sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.8, test_size=0.2)
train_size: the proportion of the dataset to include in the train split
test_size: the proportion of the dataset to include in the test split
random_state: the seed used by the random number generator [optional]
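The split can be sketched on a hypothetical 10-row dataset. The arrays below are made up so that each feature row can be matched back to its label, which shows that features and labels are shuffled together.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: row i of x is [2i, 2i + 1], and its label is i
x = np.arange(20).reshape(10, 2)
y = np.arange(10)

# random_state fixes the shuffle so the split is repeatable
x_train, x_test, y_train, y_test = train_test_split(
    x, y, train_size=0.8, test_size=0.2, random_state=42
)

print(x_train.shape, x_test.shape)  # 8 training rows, 2 test rows
print(y_train, y_test)
```

Since `train_size` and `test_size` describe complementary portions of the same data, passing only one of them is generally enough; the other is inferred.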
Conclusion
Scikit-Learn provides a powerful and user-friendly framework for implementing machine learning models in Python. From regression and classification to clustering and model validation, it simplifies complex tasks with efficient built-in functions. Whether you’re training models, making predictions, or evaluating performance, Scikit-Learn equips you with the tools needed to build robust machine learning applications.
If you’re interested in learning more about Scikit-Learn and its applications in machine learning, please check out our AI Catalog of articles!

Happy Coding!
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'