In this unit, we will learn how to construct many machine learning algorithms from scratch. Writing an algorithm from scratch is a great way to understand the fundamental principles of why it works, but a hand-written version may not give us the efficiency or reliability we need. In the real world, we don't want to recreate a complex algorithm every time we want to use it.

Scikit-learn is a Python library that provides many unsupervised and supervised learning algorithms. It's built on NumPy, SciPy, and Matplotlib, and it works well alongside pandas, so much of the technology will already be familiar!

The functionality that scikit-learn provides includes:

  • Classification, including K-Nearest Neighbors and Logistic Regression
  • Regression, including Linear Regression
  • Clustering, including K-Means (with K-Means++ initialization)
  • Dimensionality reduction
  • Model selection
  • Preprocessing, including Min-Max Normalization
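
To make a couple of these items concrete, here is a minimal sketch that combines preprocessing and clustering. The data points are made up for illustration, and the parameter choices (two clusters, a fixed random_state) are assumptions rather than recommendations.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.cluster import KMeans

# Made-up data: two well-separated groups of 2-D points.
points = np.array([
    [1.0, 200.0], [1.5, 220.0], [1.2, 210.0],
    [8.0, 900.0], [8.5, 950.0], [9.0, 1000.0],
])

# Preprocessing: Min-Max Normalization rescales each column to [0, 1].
scaled = MinMaxScaler().fit_transform(points)

# Clustering: K-Means, using K-Means++ to choose the starting centroids.
kmeans = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0)
labels = kmeans.fit_predict(scaled)

print(scaled)
print(labels)
```

Scaling first keeps the second column, which has much larger values, from dominating the distance calculations that K-Means relies on.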

As you move through the material in the Machine Learning unit, you will become familiar with many of these terms. You will also see modules from scikit-learn (imported in Python as sklearn) being used. For example:


    sklearn.linear_model.LinearRegression()

is a Linear Regression model inside the linear_model module of sklearn.
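
As a rough sketch of how such a model might be used, the following fits a LinearRegression model to a tiny made-up dataset and then predicts a new value. The data and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data that follows y = 2x + 1 exactly.
# scikit-learn expects X to be 2-D: one row per sample, one column per feature.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Create the model, then learn the slope and intercept from the data.
model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]
intercept = model.intercept_

# Predict y for a new x value the model has not seen.
prediction = model.predict(np.array([[4.0]]))[0]

print(slope, intercept, prediction)
```

Most scikit-learn models follow this same pattern: create the model, call fit on training data, then call predict on new data.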

Scikit-learn's efficient, well-tested implementations will greatly aid you in creating robust machine learning programs.

Happy coding!

Made in NYC © 2018 Codecademy