Text Preprocessing
Text is everywhere, and knowing how to clean it is a core data science skill. A common industry rule of thumb holds that roughly 80% of data science work is data cleaning, which includes text preprocessing. Turning raw text into usable data requires specialized tools and techniques. This course introduces text cleaning with Python 3 using regular expressions (regex) and NLTK.
- 1. Get a taste of regular expressions (regex), a powerful search pattern language to quickly find the text you’re looking for.
- 2. Before most natural language processing tasks, it’s necessary to clean up the text data using text preprocessing techniques.
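To make the two topics above concrete, here is a minimal sketch rather than course material. It uses only Python's built-in `re` module; the sample strings, the `clean_text` helper, and the choice of cleaning steps are assumptions made for illustration, and a plain whitespace split stands in for a tokenizer such as the one NLTK provides.

```python
import re

# 1. Regex as a search language: pull every ISO-style date out of a string.
log_line = "Order #1042 shipped on 2023-05-17; order #1043 ships 2023-06-01."
dates = re.findall(r"\d{4}-\d{2}-\d{2}", log_line)

# 2. Basic text preprocessing (hypothetical helper, illustrative steps only).
def clean_text(text):
    text = re.sub(r"<[^>]+>", " ", text)       # strip HTML-like tags
    text = text.lower()                        # normalize case
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace

raw = "<p>Text is EVERYWHERE!!   Cleaning it   matters.</p>"
cleaned = clean_text(raw)

# A whitespace split is a stand-in for a real tokenizer.
tokens = cleaned.split()

print(dates)
print(cleaned)
print(tokens)
```

Which steps to apply depends on the task: lowercasing and stripping punctuation suit a bag-of-words model, for example, but would discard information that a sentiment or named-entity task may need.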
How you'll master it
Stress-test your knowledge with quizzes that help commit syntax to memory

"I know from first-hand experience that you can go in knowing zero, nothing, and just get a grasp on everything as you go and start building right away."
Madelyn, Pinterest
Course Description
Learn to clean text with Python 3 using regular expressions (regex) and NLTK.
Details
Earn a certificate of completion
2 hours to complete in total
Intermediate
Get a taste of regular expressions (regex), a powerful search pattern language to quickly find the text you’re looking for.
1 lesson, 1 external resource, 1 article, 1 quiz