Not only can subplots help us fit graphs nicely onto a page, the arrangement of graphs in a subplot can also be used as a tool to make a clearer visual argument.
Side-by-side charts allow us to quickly compare y-axis (vertical) changes. This is often the best presentation for direct comparisons of similar charts, especially when the x-axis is consistent.
Stacked charts allow for easy comparison of changes on the x-axis (horizontal).
A tiled grid creates a setup known as “small multiples”. This is ideal for showing a pattern between multiple visualizations, or emphasizing one visualization in the context of others.
With any setup, we should keep the scale and axis bounds the same whenever possible so that the graphs can be compared directly. If that’s not possible, we need to make sure the viewer knows what changes from one graph to the next. Graphs can become confusing or even misleading if the scale, axes, or units are changed without explicit notice. Titles, annotations, and axis labels are good places to put this information.
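As a rough sketch of the advice above, the snippet below draws two side-by-side bar charts and gives both the same y-axis bounds. The data values and the names `forest_a`, `forest_b`, and `shared_ylim` are made up for illustration; `plt.subplot()` is the function introduced in the next paragraph.

```python
import matplotlib.pyplot as plt

# Made-up placeholder data, for illustration only
categories = ["A", "B", "C"]
forest_a = [12, 30, 18]
forest_b = [45, 80, 60]

# One shared y-range that covers both datasets, with a little headroom
shared_ylim = (0, max(forest_a + forest_b) * 1.1)

fig = plt.figure(figsize=(8, 3))

ax_left = plt.subplot(1, 2, 1)   # 1 row, 2 columns, first square
plt.bar(categories, forest_a)
plt.ylim(shared_ylim)
plt.title("Forest A")
plt.ylabel("Count (trees)")      # state the units explicitly

ax_right = plt.subplot(1, 2, 2)  # same grid, second square
plt.bar(categories, forest_b)
plt.ylim(shared_ylim)
plt.title("Forest B")

plt.tight_layout()
```

Because both charts use the same bounds, bar heights can be compared directly across the two panels. Call `plt.show()` afterwards to display the figure.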
To make these subplot arrangements, we use the plt.subplot() function, which takes parameters for the number of rows, the number of columns, and the index (i.e. position in the grid). For example, the code below makes a grid with 4 rows and 2 columns (so, 8 squares), and will “select” the sixth square in the grid:
plt.subplot(4, 2, 6)
## code for plot #6 goes here
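To see how the index maps onto the grid, here is a small sketch that fills all 8 squares of a 4-by-2 grid and writes each square's index inside it. The variable names (`axes_by_index`, `sixth`) are hypothetical helpers for this illustration only.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(4, 6))

# Squares are numbered left to right, top to bottom, starting at 1
axes_by_index = {}
for index in range(1, 9):
    ax = plt.subplot(4, 2, index)
    ax.text(0.5, 0.5, str(index), ha="center", va="center")
    ax.set_xticks([])
    ax.set_yticks([])
    axes_by_index[index] = ax

# Look up where the sixth square sits in the grid
sixth = axes_by_index[6]
spec = sixth.get_subplotspec()
row, col = spec.rowspan.start, spec.colspan.start  # zero-based row and column
print(row, col)
```

The sixth square lands in the third row, second column, since the count runs across each row before moving down to the next.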
Let’s put some of these to work on our dataset now. Again, we’ll load the tree inventory dataset, along with subsets of relevant information. This time, the trees are subset by forest type: PF (primary forest), SF (secondary forest), and SLF (selectively logged forest). We’ll use multiple vertical bar charts to compare the counts of the most common trees in each forest type. Our goal for the next few visualizations is to better understand biodiversity in the three different types of forest.
Run the Setup cells above to load our data, and take a few minutes to look at the subsets. Then, make 3 subplots in a 3-row by 1-column grid. Make a barplot in each subplot, plotting genus on the x-axis and counts on the y-axis for each type of forest. Add the following xlabels to each x-axis so we can keep track of which graph represents which forest: Primary Forest, Secondary Forest, and Selectively Logged Forest.
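One possible shape for this step is sketched below. The Setup cells are not shown here, so the `forests` dictionary is a hypothetical stand-in with placeholder genus names and counts, not the real inventory data, and `plt.bar()` is assumed as the barplot function.

```python
import matplotlib.pyplot as plt

# Placeholder stand-ins for the subsets loaded by the Setup cells;
# these names and values are hypothetical, not the real inventory data.
forests = {
    "Primary Forest": (["Genus A", "Genus B", "Genus C"], [70, 41, 25]),
    "Secondary Forest": (["Genus A", "Genus D"], [100, 38]),
    "Selectively Logged Forest": (["Genus B", "Genus C", "Genus E"], [66, 30, 12]),
}

fig = plt.figure(figsize=(6, 9))
axes = []
for position, (label, (genus, counts)) in enumerate(forests.items(), start=1):
    ax = plt.subplot(3, 1, position)  # 3 rows, 1 column
    plt.bar(genus, counts)            # genus on x, counts on y
    plt.xlabel(label)                 # say which forest this chart shows
    axes.append(ax)

plt.tight_layout()
```

With the real subsets, the two lists in each entry would be replaced by the genus and counts columns of the matching forest subset.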
Okay, that’s looking pretty hard to read right now – we’ll work on fixing up this graph throughout this exercise and the next one. To start, let’s standardize the y-axes. The counts in the Secondary Forest graph are near 100, while the other two reach only to about 70. Change the y-axes range to
(0,105) for each graph using
Looking better already! But do you notice that the bars are slightly wider in the middle graph? Let’s standardize the x-axis too, by setting the
(-3, 105) to give a cushion on either side of the graph.
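A sketch of both standardization steps together is shown below. This passage does not name the functions to use, so `plt.ylim()` and `plt.xlim()` are an assumption here. The data are also placeholders: each hypothetical forest gets a different number of genera so that the bar widths would differ without a shared x-range.

```python
import matplotlib.pyplot as plt

# Hypothetical placeholder data: (number of genera, highest count) per forest
forests = {
    "Primary Forest": (90, 70),
    "Secondary Forest": (60, 100),
    "Selectively Logged Forest": (95, 68),
}

fig = plt.figure(figsize=(6, 9))
axes = []
for position, (label, (n_genera, top_count)) in enumerate(forests.items(), start=1):
    ax = plt.subplot(3, 1, position)
    # Made-up counts that fall off from the top genus down to 1
    counts = [max(1, top_count - i) for i in range(n_genera)]
    plt.bar(range(n_genera), counts)
    plt.xlabel(label)
    plt.ylim(0, 105)   # same y-range on every chart
    plt.xlim(-3, 105)  # same x-range, with a cushion on either side
    axes.append(ax)

plt.tight_layout()
```

With identical limits on all three charts, a bar of a given height or position means the same thing in every panel.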
Now we have a real visual comparison going – the tree counts in the top genera are clearly higher in Secondary Forests, but it looks like Primary and Selectively Logged Forests have more species of trees. Let’s clean it up a little more by rotating the labels. Recall that we can use matplotlib’s general function
plt.xticks(), and pass in arguments for horizontal alignment,
rotation. Set the horizontal alignment equal to
"left" and the rotation equal to