While scikit-learn provides many ready-made transformers and estimators that can be used in pipelines, at some point you may need to create your own. This is simpler than you might think, because a pipeline step only needs a few methods implemented. An intermediate step needs fit and transform methods, which we will demonstrate in the exercise below.

Here are some of the major takeaways on pipelines:

  • Pipelines help make code concise and reproducible by combining a sequence of transformers and/or a final estimator.

  • Intermediate steps of a pipeline must have both the .fit() and .transform() methods. This includes steps for preprocessing, imputation, feature selection, and dimensionality reduction.

  • The final step of a pipeline must have the .fit() method; it can be either a transformer or an estimator/model.

  • If the pipeline is meant to only transform your data by combining preprocessing and data cleaning steps, then each step in the pipeline will be a transformer. If your pipeline will also include a model (a final estimation or prediction step), then the last step must be an estimator.

  • Once the steps of a pipeline are defined, it can be used like any other transformer/estimator by calling its fit, transform, and/or predict methods. Similarly, it can be used in place of an estimator in a hyperparameter grid search.
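The takeaways above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the course's dataset: the step names ("imputer", "scaler", "clf"), the choice of LogisticRegression, and the grid of C values are all assumptions made for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 60 rows, 3 numeric features, binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X[::10, 2] = np.nan  # inject a few missing values into the third column

# A transformer-only pipeline: every step has fit and transform.
prep = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scaler", StandardScaler()),
])
X_prepped = prep.fit_transform(X)

# A pipeline that ends in an estimator: the last step only needs fit.
model = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression()),
])
model.fit(X, y)
preds = model.predict(X)

# The whole pipeline can stand in for an estimator in a grid search.
# Step parameters are addressed as "<step name>__<parameter>".
search = GridSearchCV(model, param_grid={"clf__C": [0.1, 1.0, 10.0]}, cv=3)
search.fit(X, y)
best_C = search.best_params_["clf__C"]
```

Because the imputer and scaler sit inside the pipeline, the grid search refits them on each training fold, so no statistics from the validation fold leak into preprocessing.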



Examine the code written for the class MyImputer. It replicates SimpleImputer with the mean strategy. Notice that both the fit and transform methods are defined. Use this new class as the first step in new_pipeline, with StandardScaler as the second step.
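The exercise's own MyImputer code is not reproduced here, so the following is only a sketch of what such a class might look like. It assumes the class inherits from BaseEstimator and TransformerMixin and stores the column means in an attribute named means_; the step names and the toy array are likewise hypothetical.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class MyImputer(BaseEstimator, TransformerMixin):
    """Sketch of a mean imputer (assumed design, not the course's code)."""

    def fit(self, X, y=None):
        # Learn one mean per column, ignoring missing values.
        self.means_ = np.nanmean(np.asarray(X, dtype=float), axis=0)
        return self  # fit must return self so that steps can be chained

    def transform(self, X):
        # Work on a copy so the caller's data is left untouched.
        X = np.asarray(X, dtype=float).copy()
        rows, cols = np.where(np.isnan(X))
        X[rows, cols] = self.means_[cols]
        return X


# Custom imputer as the first step, StandardScaler as the second.
new_pipeline = Pipeline([
    ("imputer", MyImputer()),
    ("scale", StandardScaler()),
])

# Toy data (assumption): column means are 2.0 and 15.0.
toy = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan]])
filled = MyImputer().fit_transform(toy)
scaled = new_pipeline.fit_transform(toy)
```

Inheriting from TransformerMixin supplies fit_transform for free, which is why only fit and transform need to be written by hand.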


Fit the new pipeline on the numeric columns of the training data. Its output should be identical to that of the pipeline created in exercise 2. Verify this by performing the following steps:

  1. Transform the test set x_test[num_cols] with new_pipeline and store the result in a new variable x_transform.
  2. Calculate the absolute difference between the arrays x_transform and x_test_fill_missing_scale, then sum the resulting array. Assign this number to a variable array_diff and print it.
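One way the verification could look end to end is sketched below. The data is synthetic, the column names in num_cols are invented, the reference pipeline is an assumed stand-in for the one from exercise 2, and MyImputer is a compact hypothetical version rather than the course's class.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class MyImputer(BaseEstimator, TransformerMixin):
    """Compact stand-in for the exercise's mean imputer (assumption)."""

    def fit(self, X, y=None):
        self.means_ = np.nanmean(np.asarray(X, dtype=float), axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float).copy()
        rows, cols = np.where(np.isnan(X))
        X[rows, cols] = self.means_[cols]
        return X


# Synthetic numeric data with some missing values (assumption).
rng = np.random.default_rng(42)
num_cols = ["a", "b", "c"]
data = pd.DataFrame(rng.normal(size=(40, 3)), columns=num_cols)
data.iloc[::7, 1] = np.nan
x_train, x_test = train_test_split(data, test_size=0.25, random_state=0)

# Reference result: SimpleImputer (mean) followed by StandardScaler.
reference = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])
reference.fit(x_train[num_cols])
x_test_fill_missing_scale = reference.transform(x_test[num_cols])

# Same pipeline, with the custom imputer as the first step.
new_pipeline = Pipeline([
    ("imputer", MyImputer()),
    ("scale", StandardScaler()),
])
new_pipeline.fit(x_train[num_cols])
x_transform = new_pipeline.transform(x_test[num_cols])

# Sum of absolute element-wise differences between the two results.
array_diff = abs(x_transform - x_test_fill_missing_scale).sum()
print(array_diff)
```

If the two pipelines are equivalent, the printed value should be zero or, at most, a tiny number attributable to floating-point rounding.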
