K-Means Clustering
Implementing K-Means: Step 1

The K-Means algorithm:

1. Place `k` random centroids for the initial clusters.
2. Assign data samples to the nearest centroid.
3. Update centroids based on the above-assigned data samples.

Repeat Steps 2 and 3 until convergence.
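The three steps and the repeat-until-convergence loop can be sketched in NumPy as below. This is a minimal illustration, not the lesson's reference solution: the `samples` array is made-up data rather than the Iris set, the seed and the cap of 100 iterations are arbitrary choices, and "convergence" is assumed to mean that the centroids stop moving.

```python
import numpy as np

np.random.seed(0)  # seeded only so the sketch is repeatable

# Hypothetical 2-D samples: three loose blobs (not the Iris data)
samples = np.vstack([
    np.random.normal(loc=center, scale=0.3, size=(20, 2))
    for center in [(0, 0), (3, 3), (0, 3)]
])

k = 3

# Step 1: place k random centroids inside the range of the data
centroids = np.random.uniform(samples.min(axis=0), samples.max(axis=0), size=(k, 2))

for _ in range(100):  # assumed upper bound on iterations
    # Step 2: assign each sample to its nearest centroid
    distances = np.linalg.norm(samples[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)

    # Step 3: move each centroid to the mean of its assigned samples
    new_centroids = centroids.copy()
    for i in range(k):
        members = samples[labels == i]
        if len(members) > 0:  # leave an empty cluster's centroid where it is
            new_centroids[i] = members.mean(axis=0)

    # Convergence (assumed definition): the centroids no longer move
    if np.allclose(new_centroids, centroids):
        break
    centroids = new_centroids

print(centroids)
```

The rest of this exercise covers only Step 1; Steps 2 and 3 appear here just to show where the random centroids are used.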

Now that we have looked at the scatter plot and have a better understanding of the Iris data, let’s start implementing the K-Means algorithm.

In this exercise, we will implement Step 1.

Because we expect three clusters (one for each of the three species of flowers), let’s implement K-Means with `k` set to 3.

Using the NumPy library, we will create three random initial centroids and plot them along with our samples.

### Instructions

1. First, create a variable named `k` and set it to 3.

2. Then, use NumPy’s `random.uniform()` function to generate two arrays of random values:

• `centroids_x`, which will hold `k` random values between `min(x)` and `max(x)`
• `centroids_y`, which will hold `k` random values between `min(y)` and `max(y)`

A call to `random.uniform()` looks like:

``np.random.uniform(low, high, size)``

`centroids_x` holds the x-values of our initial random centroids, and `centroids_y` holds their y-values.
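As a rough sketch of this step, the snippet below draws `k` random x-values and `k` random y-values within the range of the data. The `x` and `y` arrays here are small made-up stand-ins for the lesson's data, not actual Iris measurements.

```python
import numpy as np

# Hypothetical stand-ins for the lesson's x and y arrays
x = np.array([5.1, 4.9, 6.3, 5.8, 7.1, 6.5])
y = np.array([3.5, 3.0, 3.3, 2.7, 3.0, 3.0])

k = 3

# Draw k values uniformly from [min, max) along each axis
centroids_x = np.random.uniform(min(x), max(x), size=k)
centroids_y = np.random.uniform(min(y), max(y), size=k)

print(centroids_x)
print(centroids_y)
```

Because the values are random, the printed numbers differ on every run unless a seed is set first, for example with `np.random.seed()`.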

3. Create an array named `centroids`, using the `zip()` function to pair each value in `centroids_x` with the corresponding value in `centroids_y`.

Combining two arrays with `zip()` looks like:

``np.array(list(zip(array1, array2)))``

Then, print `centroids`.

The `centroids` array should now hold all the initial centroids, one `(x, y)` pair per row.
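To make the shape of the result concrete, here is a small sketch of the `zip()` pattern. The centroid coordinates are fixed, made-up values (rather than random ones) so that the pairing is easy to follow.

```python
import numpy as np

# Hypothetical fixed coordinates, used only to make the pairing visible
centroids_x = np.array([5.2, 6.8, 6.0])
centroids_y = np.array([3.4, 2.9, 3.1])

# zip() pairs the i-th x-value with the i-th y-value;
# np.array() turns the list of pairs into a 2-D array
centroids = np.array(list(zip(centroids_x, centroids_y)))

print(centroids)
print(centroids.shape)  # (3, 2): one row per centroid, columns are x and y
```

NumPy's `np.column_stack((centroids_x, centroids_y))` builds the same array without going through a Python list; the `zip()` form is shown because it is the one the exercise asks for.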

4. Make a scatter plot of `y` vs `x`.

On the same axes, make a scatter plot of `centroids_y` vs `centroids_x`.

Show the plots to see your centroids!
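One way the plotting step might look with Matplotlib is sketched below. The `x` and `y` arrays are randomly generated stand-ins for the lesson's data, and the marker shape, size, transparency, and axis labels are arbitrary styling choices, not requirements of the exercise.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)  # seeded only so the sketch is repeatable

# Hypothetical stand-ins for the lesson's x and y arrays
x = np.random.uniform(4.0, 8.0, size=50)
y = np.random.uniform(2.0, 4.5, size=50)

k = 3
centroids_x = np.random.uniform(min(x), max(x), size=k)
centroids_y = np.random.uniform(min(y), max(y), size=k)

# Samples first, then the centroids on the same axes
plt.scatter(x, y, alpha=0.5)
plt.scatter(centroids_x, centroids_y, marker="D", s=100)

plt.xlabel("x")
plt.ylabel("y")
plt.show()
```

Calling `plt.scatter()` twice before `plt.show()` draws both sets of points on one figure, so the centroids appear on top of the samples.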