Learn

In data visualization, it’s often helpful to compare multiple simple visualizations. As we saw in the last exercise, there were many Spotify categories to compare across genres, and viewing them as individual scatterplots definitely made more sense than trying to fit them all into the same graph.

Rather than making us scroll between separate graphs, matplotlib lets us position plots in a grid using the general function for subplots: plt.subplot(). It takes the following arguments, in this order:

  • num_rows: number of rows in the grid
  • num_columns: number of columns in the grid
  • index: the numbered position of the subplot, reading the grid from left-to-right, top-to-bottom
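To make the left-to-right, top-to-bottom numbering concrete, here is a minimal sketch in plain Python. The helper subplot_position() is hypothetical (it is not part of matplotlib); it only works out which row and column a given index lands in.

```python
def subplot_position(num_columns, index):
    """Return the (row, column) of a subplot index, both counted from 1."""
    row = (index - 1) // num_columns + 1
    column = (index - 1) % num_columns + 1
    return row, column

# In a grid with 3 columns, index 4 wraps around to the start of the second row.
for index in range(1, 7):
    print(index, subplot_position(3, index))
```

Note that the index starts at 1, not 0, so the top-left plot is always index 1.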

Organizationally, it’s helpful to think of each plt.subplot() call as opening a new section of code: the graph function and any general functions (like plt.title()) that follow it apply to that subplot, until the next plt.subplot() call introduces the next one.

Additionally, to add a single title above all the subplots, we use the general function plt.suptitle(), which creates a super title for the whole figure.
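A small sketch of how these pieces fit together, assuming made-up x and y values. The matplotlib.use("Agg") line is only there so the code runs without a display; in a notebook it can probably be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Made-up points, used only for illustration
x = [1, 2, 3, 4]
y = [2, 4, 1, 3]

fig = plt.figure()
plt.suptitle('Two Views of the Same Data')  # one title for the whole figure

plt.subplot(1, 2, 1)    # everything below applies to the left plot...
plt.scatter(x, y)
plt.title('Left plot')

plt.subplot(1, 2, 2)    # ...until this call switches to the right plot
plt.scatter(y, x)
plt.title('Right plot')
```

Each plt.title() call lands on whichever subplot was introduced most recently, while plt.suptitle() belongs to the figure as a whole.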

To make a grid of 6 scatterplots, in two rows and three columns, with the plots colored in rainbow order, we could use the following code:

import matplotlib.pyplot as plt

# x and y are assumed to be defined already (any two equal-length sequences)
plt.suptitle('Rainbow Scatterplots')
plt.subplot(2, 3, 1)
plt.scatter(x, y, color='red')
plt.subplot(2, 3, 2)
plt.scatter(x, y, color='orange')
plt.subplot(2, 3, 3)
plt.scatter(x, y, color='yellow')
plt.subplot(2, 3, 4)
plt.scatter(x, y, color='green')
plt.subplot(2, 3, 5)
plt.scatter(x, y, color='blue')
plt.subplot(2, 3, 6)
plt.scatter(x, y, color='purple')

Notice how all these subplots have the same first two numbers, (2, 3, __), since they’re all part of the same 2-by-3 grid. Only the index number changes, indicating which subplot is being created. The grid would appear in this order:

A grid of 6 identical scatterplots in 2 rows and 3 columns. From top left to bottom right, the plots are colored red, orange, yellow, green, blue and purple.
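Because only the index and the color change from one subplot to the next, the same grid could also be built with a loop. This is one possible sketch, not the lesson's own code; the x and y values are made up, and the "Agg" backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Made-up data, used only for illustration
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 1, 5]
colors = ['red', 'orange', 'yellow', 'green', 'blue', 'purple']

fig = plt.figure()
plt.suptitle('Rainbow Scatterplots')

# start=1 because subplot indices begin at 1, not 0
for index, color in enumerate(colors, start=1):
    plt.subplot(2, 3, index)
    plt.scatter(x, y, color=color)
```

The loop trades a little readability for less repetition, which matters more as the grid grows.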

These scatterplots all use the same data, so there’s not much to compare. In this exercise, we’ll use real data about the characteristics of Spotify genres: our goal will be to see if any of the variables have a correlation with the popularity variable.

Instructions

1.

First, run the Setup cells above to load in the necessary packages and the spotify_data_by_genres csv. Our goal for this exercise is to see if any of the variables have a correlation with the popularity variable. In this step, write the code to make an “empty” subplot grid with 3 rows and 4 columns. We won’t put any graphs in the grid yet, so your code will use only the subplot function.

2.

Now, add a line of code below each subplot to make a scatterplot in each square of the grid. Each scatterplot should have popularity as the x-variable and then the following y-variables: acousticness, danceability, duration_ms, energy, instrumentalness, liveness, loudness, speechiness, tempo, valence, key, and popularity. (Yes, the final square will plot popularity against itself!) Set the alpha equal to 0.05.

3.

This gives us something to work with! These correlations are a bit all over the place, so it’s helpful that we can easily compare their differences. Add a title above each graph using an instance of plt.title() for each subplot.

4.

Finally, we’ll add our super title. Above the first subplot, add a line of code to make the super title: ‘Relationship between Popularity and __’. (Copy and paste the title from here to make sure you get the correct number of underscores!)
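For reference, here is one possible shape for the finished grid, sketched with a loop instead of twelve separate blocks. The data dictionary is a hypothetical stand-in filled with random numbers, not the real Spotify values; in the exercise you would plot columns from the loaded spotify_data_by_genres data instead. The figsize value and the "Agg" backend line are also assumptions.

```python
import random
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

y_variables = ['acousticness', 'danceability', 'duration_ms', 'energy',
               'instrumentalness', 'liveness', 'loudness', 'speechiness',
               'tempo', 'valence', 'key', 'popularity']

# Stand-in data: random numbers, NOT the real Spotify values
random.seed(0)
data = {name: [random.random() for _ in range(200)] for name in y_variables}

fig = plt.figure(figsize=(12, 8))  # assumed size, chosen so 12 plots are legible
plt.suptitle('Relationship between Popularity and __')

for index, name in enumerate(y_variables, start=1):
    plt.subplot(3, 4, index)                                  # step 1: the grid
    plt.scatter(data['popularity'], data[name], alpha=0.05)  # step 2: the plot
    plt.title(name)                                           # step 3: its title
```

With random stand-in data the plots will show no real pattern; the point is only the structure of the subplot, scatter, and title calls.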
