At this point, we have grouped the Iris plants into 3 clusters. But suppose we didn’t know that the dataset contains three species of Iris. What is the best number of clusters, and how do we determine it?

Before we answer that, we need to define what makes a *good* cluster.

Good clustering results in tight clusters, meaning that the samples in each cluster are bunched together. How spread out the clusters are is measured by *inertia*. Inertia is the sum of the squared distances from each sample to the centroid of its cluster. The lower the inertia, the more tightly the samples are grouped around their centroids.
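To make the definition concrete, here is a minimal sketch that computes inertia by hand as the sum of squared distances from each sample to its cluster's centroid, which is how scikit-learn's `inertia_` attribute is defined. The points, the labels, and the helper names `centroid` and `inertia` are made up for illustration and are not part of the Iris exercise.

```python
# Toy 2-D data: made-up points, not taken from the Iris dataset.
samples = [(1.0, 1.0), (1.0, 3.0), (5.0, 5.0), (7.0, 5.0)]
labels = [0, 0, 1, 1]  # hypothetical cluster assignment for each sample


def centroid(points):
    """Return the coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def inertia(samples, labels):
    """Sum of squared distances from each sample to its cluster's centroid."""
    clusters = {}
    for point, label in zip(samples, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for points in clusters.values():
        c = centroid(points)
        for p in points:
            total += sum((x - m) ** 2 for x, m in zip(p, c))
    return total


print(inertia(samples, labels))  # 4.0
```

Grouping the same points less sensibly, for example with labels `[0, 1, 0, 1]`, gives a much larger value (36.0), which is what "spread out" clusters look like numerically.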

You can check the inertia of a model by:

`print(model.inertia_)`

For the Iris dataset, we can graph each `k` (number of clusters) against its inertia:

Notice how the inertia keeps decreasing as the number of clusters increases.

Ultimately, this will always be a trade-off. The goal is to have low inertia *and* as few clusters as possible.

One way to interpret this graph is the *elbow method*: choose the “elbow” in the inertia plot, the point where inertia begins to decrease more slowly.

In the graph above, 3 is the optimal number of clusters.
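One rough way to see where the decrease slows down is to look at how much inertia drops each time a cluster is added. The sketch below uses invented inertia values (they are not the Iris results) purely to show the pattern: large drops before the elbow, small ones after it.

```python
# Hypothetical inertias for k = 1..6; these numbers are invented for
# illustration and are NOT the values from the Iris dataset.
num_clusters = [1, 2, 3, 4, 5, 6]
inertias = [600.0, 250.0, 80.0, 60.0, 48.0, 40.0]

# Drop in inertia gained by going from k to k + 1 clusters.
drops = [a - b for a, b in zip(inertias, inertias[1:])]

for k, drop in zip(num_clusters, drops):
    print(f"k={k} -> k={k + 1}: inertia drops by {drop}")
```

With these made-up numbers, the drops shrink sharply once `k` reaches 3, so the curve would bend there. Reading the elbow off a real plot is still a judgment call rather than an exact rule.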

### Instructions

**1.**

First, create two lists:

- `num_clusters`, which has the values 1, 2, 3, … 8
- `inertias`, which is empty

**2.**

Then, iterate through `num_clusters` and calculate K-means for each number of clusters.

Add each of their inertias to the `inertias` list.

**3.**

Plot `inertias` vs. `num_clusters`:

```
plt.plot(num_clusters, inertias, '-o')
plt.xlabel('Number of Clusters (k)')
plt.ylabel('Inertia')
plt.show()
```
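Steps 1 and 2 above could be sketched as follows. This is one possible approach, not the only solution. It assumes the model is scikit-learn's `KMeans` and that the samples come from `load_iris`, neither of which this section names explicitly. The `n_init` and `random_state` arguments are added here only to make runs repeatable.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: the samples are the four measurements in scikit-learn's
# bundled copy of the Iris dataset.
samples = load_iris().data

num_clusters = list(range(1, 9))  # 1, 2, ..., 8
inertias = []

for k in num_clusters:
    # Fit a separate K-means model for each candidate number of clusters.
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(samples)
    inertias.append(model.inertia_)

for k, value in zip(num_clusters, inertias):
    print(k, round(value, 2))
```

The resulting `num_clusters` and `inertias` lists can then be passed to the plotting snippet from step 3.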