Congratulations! You just learned how a logistic regression model works and how to fit one to a dataset. Here are some of the things you learned:

  • Logistic regression is used to perform binary classification.
  • Logistic regression is an extension of linear regression where we use a logit link function to fit a sigmoid curve to the data, rather than a line.
  • We can use the coefficients from a logistic regression model to estimate the log odds that a datapoint belongs to the positive class. We can then transform the log odds into a probability.
  • The coefficients of a logistic regression model can be used to estimate relative feature importance.
  • A classification threshold is used to determine the probabilistic cutoff for where a data sample is classified as belonging to a positive or negative class. The default cutoff in sklearn is 0.5.
  • We can evaluate a logistic regression model using a confusion matrix or summary statistics such as accuracy, precision, recall, and F1 score.


Find another dataset for binary classification from Kaggle or take a look at sklearn‘s breast cancer dataset. Use sklearn to build your own Logistic Regression model on the data and make some predictions. Which features are most important in the model you build?

Take this course for free

Mini Info Outline Icon
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.

Or sign up using:

Already have an account?