Great! We have looked at several methods for getting data into the format we want for analysis.

Specifically, we have covered:

  • diagnosing the “tidiness” of the data
  • reshaping the data
  • combining multiple files
  • changing the types of values
  • dropping or filling missing values to deal with incomplete data
  • manipulating strings to represent the data better

You can use these methods to transform your datasets to be clean and easy to work with!
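
As a quick recap, here is a minimal sketch of how several of these steps might fit together. It assumes the pandas library, which this section does not name, and uses made-up data; the table and column names (`part1`, `part2`, `exam_1`, `exam_2`) are hypothetical and only for illustration.

```python
import pandas as pd

# Two hypothetical "files" with the same wide layout (made-up data).
part1 = pd.DataFrame({
    "name": ["Ann", "Bob"],
    "exam_1": ["90%", "72%"],
    "exam_2": ["85%", None],   # Bob's second score is missing
})
part2 = pd.DataFrame({
    "name": ["Cy"],
    "exam_1": ["64%"],
    "exam_2": ["70%"],
})

# Combining multiple files: stack the parts into one table.
scores = pd.concat([part1, part2], ignore_index=True)

# Reshaping: go from one column per exam to one row per (name, exam).
tidy = scores.melt(id_vars="name", var_name="exam", value_name="score")

# Manipulating strings: strip the percent sign from each score.
tidy["score"] = tidy["score"].str.replace("%", "", regex=False)

# Changing types: convert the score strings to numbers.
tidy["score"] = pd.to_numeric(tidy["score"])

# Filling missing values: here, with the mean of the known scores.
tidy["score"] = tidy["score"].fillna(tidy["score"].mean())

print(tidy)
```

Filling with the mean is only one option. Dropping the incomplete rows with `dropna()` may be the better choice when a filled-in value would mislead the analysis.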


Congratulations! You have some good data cleaning tools in your toolkit.

If you’d like, go ahead and test your skills on a dataset that interests you on Kaggle. What kinds of insights can you find?

