Avoiding Misleading Data Visualizations
Apr 01, 2018
The point of data communication is to make the information it represents coherent and readily understood. Data visualization is an effective way to accomplish that. Misleading or confusing visualizations not only fail to get their point across, but risk that the data will be misinterpreted or not acted upon in meaningful ways. In this video, you'll learn how to avoid misleading data visualizations.
To start, human brains are not well suited to dealing with scale, and it's hard to fully comprehend big numbers. For that reason, start the scale at zero. A change in scale can lead to vastly different interpretations of the same data, and the start value of an axis has a big impact on how large the differences between values appear. Suppose we have two graphs displaying the same interest rate information. On one, the y-axis on the left covers only a narrow range in small increments, making it seem as if the interest rate is increasing drastically. In reality, there's scarcely a point of difference. A second graph, with the y-axis starting at 0, makes this much more obvious.
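As a rough illustration of why the axis start matters, the sketch below compares how tall two points are drawn relative to each other under two different axis baselines. The interest rates (3.0 and 3.5 percent), the truncated baseline of 2.9, and the helper name `apparent_ratio` are all made up for this example and are not taken from the graphs described above.

```python
def apparent_ratio(low, high, axis_start):
    """Ratio of the drawn heights of two values when the axis begins at axis_start."""
    if axis_start >= low:
        raise ValueError("axis_start must be below the smaller value")
    return (high - axis_start) / (low - axis_start)


# Hypothetical interest rates, in percent.
rate_before, rate_after = 3.0, 3.5

# Same data, two different starting points for the y-axis.
truncated = apparent_ratio(rate_before, rate_after, axis_start=2.9)
zero_based = apparent_ratio(rate_before, rate_after, axis_start=0.0)

print(f"axis from 2.9: second point drawn {truncated:.1f}x as high")
print(f"axis from 0:   second point drawn {zero_based:.2f}x as high")
```

With these assumed numbers, the truncated axis draws a half-point change as roughly a sixfold jump, while the zero-based axis shows it as an increase of about 17 percent.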
Next, avoid using cumulative graphs. Many organizations use these to exaggerate performance in terms of product sales or overall revenue. Whenever there's a steady upward climb in a graph, viewers tend to assume things are going well, and a cumulative total keeps climbing as long as each period adds anything at all. Presenting the same information in a non-cumulative format may reveal the opposite to be true.
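A minimal sketch of that effect, using invented quarterly sales figures: the running total rises every quarter even though the sales themselves fall every quarter. The numbers and the helper `is_rising` are assumptions for illustration only.

```python
from itertools import accumulate

# Hypothetical sales per quarter; each quarter is worse than the last.
quarterly_sales = [120, 100, 80, 60]

# The cumulative view adds each quarter to the running total.
cumulative_sales = list(accumulate(quarterly_sales))


def is_rising(values):
    """True if every value is larger than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


print("per quarter:", quarterly_sales, "rising?", is_rising(quarterly_sales))
print("cumulative: ", cumulative_sales, "rising?", is_rising(cumulative_sales))
```

Plotting `cumulative_sales` would show a steady climb, while plotting `quarterly_sales` would show the decline that the cumulative view hides.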
Be sure to follow standard graphing conventions. For example, don't use an upside-down y-axis, which suggests a different result than the data actually shows. Putting data into non-standard formats can be misleading at best and dishonest at worst.
Finally, don't omit data unnecessarily. Ignoring specific data points can create a trend or pattern that doesn't reflect the actual data, which can lead to incorrect conclusions. Let's say we have two graphs of stock market trends. A company can readily fool investors into thinking the market is steady by showing them a graph that omits every second year and so displays a steady increase. Accuracy and honesty require showing a graph that includes data points for every year, reflecting the actual volatility of the market.
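The omission described here can be sketched with a made-up yearly index: keeping only every second year leaves a smooth upward series, while the full series swings up and down. The years, the index values, and the helper `is_rising` are hypothetical and chosen only to make the effect visible.

```python
# Hypothetical year-end index values, one per year from 2010 through 2018.
years = list(range(2010, 2019))
index_values = [100, 80, 110, 85, 120, 90, 130, 95, 140]

# Keeping only every second year drops all of the down years.
sampled_years = years[::2]
sampled_values = index_values[::2]


def is_rising(values):
    """True if every value is larger than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Year-over-year changes in the full series expose the volatility.
yearly_changes = [later - earlier for earlier, later in zip(index_values, index_values[1:])]

print("every second year:", dict(zip(sampled_years, sampled_values)))
print("steady increase?  ", is_rising(sampled_values))
print("all years rising? ", is_rising(index_values))
print("yearly changes:   ", yearly_changes)
```

In this invented series, the thinned-out graph rises by 10 points at every step, while the complete data alternates between drops and rebounds.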
Consider a company using a donut chart, which is similar to a pie chart, to illustrate open job listings. This is an example of too much data in one chart. The resulting image is confusing, and it is unclear what the main point is or what the viewer is intended to get from it. The outer ring is further broken into sub-categories with slices too narrow for labels, which only adds to the confusion.
Like pie charts, donut charts are better suited to showing percentages of two to four components with dramatic variations, not five or six, and certainly not a dozen. The point is that presenting data visually, using charts or graphs, requires a conscious decision about what type of visualization best represents the data in a comprehensive, coherent, and accurate way.
Presenting data to explain, to persuade, to inform, and to enlighten means making it clear and understandable for your audience. Data visualizations are communication tools that make that easier.