Text Preprocessing


Text is everywhere, and knowing how to clean it will transform your data science skillset. Many in the industry estimate that 80% of data science is data cleaning, including text preprocessing. Transforming text into usable data requires specialized tools and techniques. This course introduces text cleaning with Python 3 using regular expressions (regex) and NLTK.

Codecademy courses have been taken by employees at Google, Facebook, NASA, IBM, and Dropbox.
  1. Get a taste of regular expressions (regex), a powerful search pattern language to quickly find the text you’re looking for.
  2. Before most natural language processing tasks, it’s necessary to clean up the text data using text preprocessing techniques.
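
To give a feel for the two topics above, here is a minimal sketch that uses only Python's built-in `re` module. It is not taken from the course material. The function names `find_emails` and `clean_text`, the sample string, and the deliberately simplified email pattern are all assumptions made for illustration. A fuller pipeline would likely use NLTK for tokenization and stemming.

```python
import re


def find_emails(text):
    """Return substrings that look like email addresses.

    The pattern is intentionally simplified for illustration and will
    not match every valid address.
    """
    return re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)


def clean_text(text):
    """Strip HTML-like tags, lowercase, drop punctuation, split on whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)      # remove anything that looks like a tag
    text = text.lower()                       # normalize case
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace punctuation and symbols with spaces
    return text.split()                       # naive whitespace tokenization


# Hypothetical sample input
sample = "<p>Contact us at Info@Example.com, or visit today!</p>"

emails = find_emails(sample)
tokens = clean_text(sample)

print(emails)
print(tokens)
```

The first function shows regex used as a search tool. The second shows it used for cleanup before further processing. Note that the cleanup step splits the email address into separate tokens, which is one reason the order of preprocessing steps matters.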

How you'll master it

Stress-test your knowledge with quizzes that help commit syntax to memory

I know from first-hand experience that you can go in knowing zero, nothing, and just get a grasp on everything as you go and start building right away.
— Madelyn, Pinterest