All design choices impact how a viewer will understand a data visualization. Even the simplest visualizations have an argument, a thesis, or a central point — and the design choices we make (or ignore) can have a positive or negative effect on getting that point across.
To create more readable and understandable visualizations, we have some simple, effective tools at our disposal in matplotlib. Here are six strategies we’ll learn for making a strong, clear visual argument:
- choose the right chart
- use subplots to compare multiple graphs
- remove distracting lines (i.e., chartjunk)
- use color for emphasis
- add annotations to the graph
- present the graph with context
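As a rough preview, a few of these strategies (subplots, removing extra lines, color for emphasis, and annotation) might look like the sketch below. The species labels and heights are invented purely for illustration and are not taken from the lesson's dataset; the styling choices are one possible approach, not a prescribed one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical values, invented for illustration only (not the lesson's data)
species = ["A", "B", "C", "D"]
heights = [18.2, 24.5, 31.0, 21.7]

# Subplots: two versions of the same chart side by side for comparison
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Left panel: default styling
ax_left.bar(species, heights)
ax_left.set_title("Default styling")
ax_left.set_ylabel("Average height (m)")

# Right panel: color for emphasis, with one bar highlighted and the rest muted
tallest = heights.index(max(heights))
colors = ["lightgray"] * len(species)
colors[tallest] = "darkgreen"
ax_right.bar(species, heights, color=colors)

# Remove distracting lines: hide the top and right borders (spines)
ax_right.spines["top"].set_visible(False)
ax_right.spines["right"].set_visible(False)

# Annotation: point at the bar the chart is about
ax_right.annotate(
    "Tallest on average",
    xy=(tallest, heights[tallest]),
    xytext=(tallest - 1.9, heights[tallest] + 3),
    arrowprops={"arrowstyle": "->"},
)
ax_right.set_ylim(0, 38)  # leave headroom for the annotation text
ax_right.set_title("Emphasis, fewer lines, annotation")

fig.tight_layout()
```

In a Jupyter notebook the `matplotlib.use("Agg")` line can be dropped, since the notebook displays the figure inline.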
In this lesson, we’ll work with a dataset that catalogs trees around the Tapajós River, a tributary of the Amazon River that runs through the Amazon Rainforest. Some preliminary data manipulation has been done for you to aggregate and organize the data for our purposes. (This is a crucial step in most data visualization processes, and a great reason to become familiar with the pandas library! You can check out the other notebook in this folder if you want to see how we organized the data using pandas.) Use the Jupyter notebook to the right to explore the data, and then we’ll dive into making some visualizations in the next exercise!
Run the Setup cells above to import the necessary packages and load our datasets. Then, in the cell below, type data.head() and run the cell to preview the first five rows of the full dataset.
Next, type avg_heights and run the cell to see the whole avg_heights dataset, then compare the two datasets. What do they have in common, and how are they different?
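If you want to try these two commands outside the notebook, here is a minimal sketch using a small stand-in DataFrame. The column names (species, height_m), the values, and the way avg_heights is computed here are all assumptions made for illustration; the notebook's real datasets are loaded by the Setup cells and may be organized differently.

```python
import pandas as pd

# Hypothetical stand-in for the full dataset: one row per tree.
# Column names and values are invented for illustration.
data = pd.DataFrame({
    "species": ["A", "A", "B", "B", "C", "C"],
    "height_m": [18.0, 20.0, 30.0, 34.0, 25.0, 27.0],
})

# Assumed aggregation: one row per species with its mean height.
# The lesson's own avg_heights may have been built another way.
avg_heights = data.groupby("species", as_index=False)["height_m"].mean()

# head() returns the first five rows by default
print(data.head())

# The aggregated table is small enough to print in full
print(avg_heights)
```

The comparison to look for is the same as in the exercise: the full table has one row per observation, while the aggregated table has one row per group.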