Great, we have a very condensed bit of code that does all our data cleaning, preprocessing, and modeling in a reusable fashion! What now? Well, we can tune some of the parameters of the model by applying a grid search over a range of hyperparameter values.

A linear regression model has very few hyperparameters, really just whether we include an intercept. But we will use it as an example to see how the process works for a pipeline. The pipeline created in the previous exercise is itself an estimator: you can call .fit and .predict on it. So the pipeline can be passed as the estimator for GridSearchCV, which then refits the pipeline for each combination of parameter values in the grid and each fold in the cross-validation split.
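To make the "pipeline is an estimator" point concrete, here is a minimal sketch. The two-step pipeline (a `StandardScaler` step named "scale" followed by a `LinearRegression` step named "regr") and the toy data are assumptions standing in for the pipeline and dataset from the previous exercise. It also counts how many cross-validation fits a grid with two candidate values and five folds would trigger.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ParameterGrid
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the pipeline built in the previous exercise.
pipeline = Pipeline([("scale", StandardScaler()), ("regr", LinearRegression())])

# Toy, noise-free data: y = 3*x0 - 2*x1 + 1 (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + 1

# The pipeline behaves like any other estimator.
pipeline.fit(X, y)
preds = pipeline.predict(X)

# One fit per (parameter combination, fold) pair during the search.
param_grid = {"regr__fit_intercept": [True, False]}
n_candidates = len(ParameterGrid(param_grid))
n_folds = 5
n_fits = n_candidates * n_folds
print(n_fits)  # cross-validation fits, before any final refit on the full training set
```

With the default `refit=True`, GridSearchCV performs one additional fit of the best candidate on the whole training set after the search.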

That’s a lot, but the code is again very short. One thing to keep in mind: to reference a hyperparameter inside a pipeline, use the pipeline step name + `__` (a double underscore) + the hyperparameter name. So `regr__fit_intercept` references the named pipeline step “regr” and the hyperparameter “fit_intercept”.
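One way to check which names are available is to inspect the pipeline's own parameter dictionary. The sketch below assumes a hypothetical two-step pipeline whose final step is named "regr"; the real pipeline from the previous exercise may have different or additional steps.

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the pipeline built in the previous exercise.
pipeline = Pipeline([("scale", StandardScaler()), ("regr", LinearRegression())])

# get_params() lists nested hyperparameters as <step name>__<hyperparameter>.
nested = sorted(k for k in pipeline.get_params() if k.startswith("regr__"))
print(nested)

# The same naming scheme works for setting a value directly on the pipeline.
pipeline.set_params(regr__fit_intercept=False)
print(pipeline.named_steps["regr"].fit_intercept)
```

The exact list printed depends on the installed scikit-learn version, but `regr__fit_intercept` should be among the entries.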



Use the previously built pipeline as input to GridSearchCV, with scoring='neg_mean_squared_error' and cv=5, and fit it on the training data.


Print the best score obtained from the cross-validated grid search.
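The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the exercise solution: the synthetic data from `make_regression`, the variable names `X_train` and `y_train`, and the two-step pipeline are all placeholders for the objects created earlier in the lesson.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with a non-zero bias so the intercept matters (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0,
                       bias=50.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Hypothetical stand-in for the pipeline built in the previous exercise.
pipeline = Pipeline([("scale", StandardScaler()), ("regr", LinearRegression())])

# Grid over the single hyperparameter, addressed as <step>__<hyperparameter>.
param_grid = {"regr__fit_intercept": [True, False]}

gs = GridSearchCV(pipeline, param_grid=param_grid,
                  scoring="neg_mean_squared_error", cv=5)
gs.fit(X_train, y_train)

# best_score_ is the mean cross-validated score of the best candidate.
print(gs.best_params_)
print(gs.best_score_)
```

Because the scorer is the negated mean squared error, `best_score_` is zero or negative, and values closer to zero indicate a better fit.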
