Learn
Now it is your turn!
In this review section, find another dataset from one of the following:
- The scikit-learn library
- UCI Machine Learning Repo
- Codecademy GitHub Repo (coming soon!)
Import the pandas library as pd:
import pandas as pd
Load in the data with read_csv():
digits = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/optdigits/optdigits.tra", header=None)
Note that this dataset is already split into a training set and a test set, indicated by the file extensions .tra and .tes, so you'll need to load both files.
The command above loads only the training set.
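To load both files, one option is a small helper like the sketch below. The function name load_optdigits and the BASE_URL constant are hypothetical names chosen for illustration, and the test file name optdigits.tes is assumed from the .tes extension mentioned above.

```python
import pandas as pd

# Directory that holds the file used earlier in this exercise.
BASE_URL = "http://archive.ics.uci.edu/ml/machine-learning-databases/optdigits/"


def load_optdigits(base=BASE_URL):
    """Return (train, test) DataFrames read from the .tra and .tes files.

    `base` may be a URL prefix or a local directory path ending in a separator.
    The file name "optdigits.tes" is an assumption based on the .tes extension.
    """
    train = pd.read_csv(base + "optdigits.tra", header=None)
    test = pd.read_csv(base + "optdigits.tes", header=None)
    return train, test
```

Calling digits_train, digits_test = load_optdigits() should then give you one DataFrame per file. Check each one with .shape and .head() before clustering, since the files have no header row and the columns are numbered automatically.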
Happy Coding!
Instructions
Implement K-Means clustering on another dataset and see what you can find.
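As a starting point, here is a minimal sketch that runs K-Means on the wine dataset bundled with scikit-learn. The choice of dataset, the feature scaling step, and the settings n_clusters=3 and random_state=42 are all assumptions made for illustration, not part of the exercise. Swap in your own data and number of clusters.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler

# Load a dataset that ships with scikit-learn (no download needed).
wine = load_wine()

# K-Means relies on distances, so put every feature on a comparable scale.
X = StandardScaler().fit_transform(wine.data)

# Fit K-Means; 3 clusters is a guess that happens to match the 3 wine classes.
model = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = model.fit_predict(X)

# Cross-tabulate cluster assignments against the known classes to see
# how well the clusters line up with them.
comparison = pd.crosstab(labels, wine.target,
                         rownames=["cluster"], colnames=["class"])
print(comparison)
```

The cluster numbers are arbitrary, so cluster 0 does not have to correspond to class 0. Look for rows in the table where most samples fall into a single column.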
If you think you found something interesting, let us know by posting it on Facebook, Twitter, or Instagram.