What is Scikit-Learn?

Codecademy Team
Open-source ML library for Python. Built on NumPy, SciPy, and Matplotlib.

In this course, we will learn how to construct various machine learning algorithms from scratch. In the real world, however, we don’t want to recreate a complex algorithm every time we want to use it. Writing an algorithm from scratch is a great way to understand the fundamental principles of why it works, but we may not get the efficiency or reliability we need.

Scikit-learn is a library in Python that provides many unsupervised and supervised learning algorithms. It’s built upon some of the technology you might already be familiar with, like NumPy, SciPy, and Matplotlib!

The functionality that scikit-learn provides includes:

  • Regression, including Linear Regression
  • Classification, including Logistic Regression and K-Nearest Neighbors
  • Clustering, including K-Means (with K-Means++ initialization)
  • Model selection
  • Preprocessing, including Min-Max Normalization
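
To make the list above a little more concrete, here is a minimal sketch that touches several of these areas in one short script. The toy data, the variable names, and the parameter values (such as n_neighbors=3 and test_size=0.25) are our own assumptions for illustration; only the sklearn classes and functions themselves come from the library.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Made-up toy data: two well-separated groups of points (assumption for illustration)
X = np.array([[1.0, 10.0], [2.0, 12.0], [1.5, 11.0], [2.5, 10.5],
              [8.0, 40.0], [9.0, 42.0], [8.5, 41.0], [9.5, 40.5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Preprocessing: Min-Max Normalization rescales each feature to the range [0, 1]
X_scaled = MinMaxScaler().fit_transform(X)

# Model selection: hold out a quarter of the data for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.25, random_state=0, stratify=y
)

# Classification: K-Nearest Neighbors
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)

# Clustering: K-Means, using the K-Means++ initialization scheme
kmeans = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X_scaled)

print("KNN accuracy on held-out data:", accuracy)
print("K-Means cluster labels:", cluster_labels)
```

Notice that every model follows the same pattern: create the object, call fit on the data, then predict or score. That shared interface is a large part of what makes the library convenient.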

As you move through Codecademy’s Machine Learning content, you will become familiar with many of these terms. You will also see scikit-learn modules being used (in Python, the package is imported as sklearn). For example:

    sklearn.linear_model.LinearRegression()

is a Linear Regression model inside the linear_model module of sklearn.
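
As a rough sketch of how a model from the linear_model module might be used in practice, the snippet below fits a line to a handful of points. The data is made up for illustration (it follows y = 2x + 1 exactly), and the variable names are our own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up toy data that follows y = 2x + 1 exactly (assumption for illustration)
X = np.array([[0.0], [1.0], [2.0], [3.0]])  # shape: (n_samples, n_features)
y = np.array([1.0, 3.0, 5.0, 7.0])

# Create the model and fit it to the data
model = LinearRegression()
model.fit(X, y)

# The learned slope and intercept are stored on the fitted model
slope = model.coef_[0]
intercept = model.intercept_

# Predict the value for a new, unseen input
prediction = model.predict(np.array([[4.0]]))[0]

print("slope:", slope)
print("intercept:", intercept)
print("prediction for x = 4:", prediction)
```

Because the toy data lies on a perfect line, the fitted slope and intercept should come out very close to 2 and 1. Real data is noisier, so the fit will only approximate the underlying trend.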

The power of scikit-learn will greatly aid your creation of robust Machine Learning programs.

Happy Coding!