Machine Learning Pipelines
Lesson 1 of 1
  1. 1
    In this lesson we’re going to learn how to turn a machine learning (ML) workflow into a pipeline using scikit-learn. An ML pipeline is a modular sequence of objects that codifies and automates an ML wo…
  2. 2
    To introduce pipelines, let’s look at a common set of data cleaning/EDA tasks: dealing with missing values and scaling numeric variables. We’re going to convert an existing code base that pe…
  3. 3
    We’re now going to implement a task similar to the previous exercise with pipeline.Pipeline(), but this time with categorical variables. Specifically, we’ll be dealing with missing values in categorical d…
  4. 4
    Often, you may not want to apply every function to all columns. If our columns are of different types, we may want to apply certain parts of the pipeline only to a subset of columns. …
  5. 5
    Great! Now that we have all the preprocessing done and coded succinctly using ColumnTransformer and Pipeline, we can add a model. We will take the result at the end of the previous exercise, and …
  6. 6
    Great, we have a very condensed bit of code that does all our data cleaning, preprocessing, and modeling in a reusable fashion! What now? Well, we can tune some of the parameters of the model by …
  7. 7
    Way to go! Now that we are getting the hang of pipelines, we’re going to take things up a notch. We will now be searching over different types of models, each with its own set of hyperparameters!…
  8. 8
    While scikit-learn contains many existing transformers and classes that can be used in pipelines, at some point you may need to create your own. This is simpler than you may think, as a step in th…
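
The exercises outlined above can be sketched end to end in one short script. This is only an illustrative sketch, not the lesson's own code: the toy columns (age, income, city), the ClipOutliers class, and the choice of LogisticRegression and RandomForestClassifier are all assumptions made for the example. It chains imputation and scaling for numeric columns, imputation and one-hot encoding for categorical columns, routes each to the right columns with ColumnTransformer, adds a model, and then searches over both hyperparameters and model types with GridSearchCV.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


class ClipOutliers(BaseEstimator, TransformerMixin):
    """Hypothetical custom step: clip each column to a quantile range."""

    def __init__(self, lower=0.05, upper=0.95):
        self.lower = lower
        self.upper = upper

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Learn per-column bounds from the training data only.
        self.low_ = np.quantile(X, self.lower, axis=0)
        self.high_ = np.quantile(X, self.upper, axis=0)
        return self

    def transform(self, X):
        return np.clip(np.asarray(X, dtype=float), self.low_, self.high_)


# Toy data (assumed for illustration), with missing values injected.
n = 60
rng = np.random.default_rng(0)
age = np.linspace(20, 60, n)
df = pd.DataFrame({
    "age": age,
    "income": rng.normal(50_000, 15_000, n),
    "city": ["north", "south", "east"] * (n // 3),
})
y = (age > 40).astype(int)
df.loc[::7, "age"] = np.nan
df.loc[::9, "city"] = np.nan

# Numeric columns: impute, apply the custom step, then scale.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("clip", ClipOutliers()),
    ("scale", StandardScaler()),
])

# Categorical columns: impute with the most frequent value, then one-hot encode.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# Apply each sub-pipeline only to its own subset of columns.
preprocess = ColumnTransformer([
    ("num", numeric, ["age", "income"]),
    ("cat", categorical, ["city"]),
])

# Preprocessing plus a model in a single estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# A list of grids lets the search swap the model itself,
# with a separate set of hyperparameters for each candidate.
param_grid = [
    {
        "preprocess__num__impute__strategy": ["mean", "median"],
        "clf": [LogisticRegression(max_iter=1000)],
        "clf__C": [0.1, 1.0, 10.0],
    },
    {
        "clf": [RandomForestClassifier(n_estimators=25, random_state=0)],
        "clf__max_depth": [2, 4],
    },
]

search = GridSearchCV(model, param_grid, cv=3)
search.fit(df, y)
print(search.best_params_)
```

Step parameters are addressed with double underscores (step name, then parameter name), which is why the nested imputer strategy is spelled preprocess__num__impute__strategy. Because the whole workflow is one estimator, each cross-validation fold refits the imputers, scaler, and encoder on its own training split.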
