Often, you may not want to apply every transformation to all columns. If the columns are of different types, we may want to apply certain parts of the pipeline only to a subset of columns. This is what we saw in the two previous exercises: one set of transformations is applied to the numeric columns and another set to the categorical ones. ColumnTransformer is one way of combining these processes.

ColumnTransformer takes a list of tuples of the form (name, transformer, columns). The transformer can be any object with .fit and .transform methods, such as the SimpleImputer or StandardScaler we used previously, but it can also itself be a pipeline, as we will use in the exercise.
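To make the (name, transformer, columns) form concrete, here is a minimal sketch. The DataFrame, its column names, and the step names are made up for illustration and are not part of the lesson's dataset. Note that with the default settings, columns not listed in any tuple (here "city") are dropped from the output.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0],
    "income": [50.0, 60.0, 70.0],
    "city": ["A", "B", "A"],
})

# Each tuple is (name, transformer, columns).
ct = ColumnTransformer([
    ("impute_age", SimpleImputer(strategy="mean"), ["age"]),
    ("scale_income", StandardScaler(), ["income"]),
])

# "city" is not listed, so it is dropped (the default remainder="drop").
out = ct.fit_transform(df)
print(out.shape)
```

The output columns follow the order of the tuples: the imputed "age" column first, then the scaled "income" column.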



Create a pipeline for the numerical preprocessing and a separate pipeline for the categorical preprocessing (see the previous two exercises), named num_vals and cat_vals, respectively.


Create a ColumnTransformer named preprocess that takes the previous two pipelines and passes the numeric columns to num_vals and the categorical columns to cat_vals.


Fit the transformer on the training set and transform the test data.
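The three steps above can be sketched as follows. This is one possible shape of a solution, not the exercise's official answer: the names num_vals, cat_vals, and preprocess come from the instructions, while the toy data, the names x_train, x_test, num_cols, and cat_cols, and the specific imputation strategies are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-ins for the lesson's training and test sets.
x_train = pd.DataFrame({
    "height": [1.0, 2.0, np.nan, 3.0],
    "color": ["red", "blue", np.nan, "red"],
})
x_test = pd.DataFrame({
    "height": [np.nan, 4.0],
    "color": ["blue", "green"],
})
num_cols = ["height"]  # assumed numeric columns
cat_cols = ["color"]   # assumed categorical columns

# Step 1: one pipeline per column type.
num_vals = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])
cat_vals = Pipeline([
    ("imputer", SimpleImputer(strategy="most_frequent")),
    # handle_unknown="ignore" encodes categories unseen in training as all zeros.
    ("ohe", OneHotEncoder(handle_unknown="ignore")),
])

# Step 2: route each group of columns to its pipeline.
preprocess = ColumnTransformer([
    ("num_preprocess", num_vals, num_cols),
    ("cat_preprocess", cat_vals, cat_cols),
])

# Step 3: fit on the training set only, then transform the test data.
preprocess.fit(x_train)
x_test_transformed = preprocess.transform(x_test)
print(x_test_transformed.shape)
```

Fitting on the training set alone means the imputation values, scaling statistics, and one-hot categories are all learned from training data, so nothing from the test set leaks into the preprocessing.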
