Visualizing Data for Impact: Introduction to Data Visualization

The Purpose of Data Visualizations

Data visualizations allow us to understand data through visual exploration and communicate insights to others in a compelling way.

Visualizations can be used for exploratory data analysis to uncover patterns and relationships and for crafting truthful and persuasive data-driven arguments when presenting to an audience.

Visualizing Relationships in Data

Scatterplots, bubble charts, and line charts are effective tools for visualizing relationships between variables in a dataset. Scatterplots show the relationship between two continuous variables, bubble charts add a third dimension with bubble size, and line charts illustrate trends or changes in a variable over a continuous scale like time.

A scatterplot depicting healthcare costs from 0 to $60,000+ on the y-axis and age from 18 to 65 on the x-axis. The points are colored by smoker status. There is a band of green non-smoker dots near the bottom of the graph, rising from about $5,000 at age 18 to about $15,000 by age 65. A band of dark red smoker dots is nearer the top of the graph, rising from about $35,000 to $50,000. In between is a mix of both colors of dots, still in a fairly defined band.
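
The visual impression a scatterplot gives can be backed with a number. The sketch below computes the Pearson correlation coefficient, one common measure of linear association, for a handful of age and cost values. The data points and the function name `pearson_r` are illustrative assumptions and are not taken from the chart described above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, made up for illustration only.
ages = [18, 25, 33, 41, 50, 58, 65]
costs = [5200, 6100, 8300, 9900, 12100, 13800, 15000]

r = pearson_r(ages, costs)
print(f"r = {r:.3f}")
```

A value of r close to 1 matches a band of points rising from left to right, and a value close to -1 matches a falling band. A value near 0 only rules out a linear pattern, so the plot itself is still worth inspecting.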

Comparing Values Across Categories

Bar charts, pie charts, and line charts help evaluate and compare values between different categories. Bar charts allow easy comparison of values across categories, pie charts show the proportion or percentage that each category contributes to a whole, and line charts can compare values or trends across categories such as time periods or regions.

A stacked bar chart shows the breakdown by insurance coverage type for each state in the US. It is arranged in ascending order by Uninsured percentage, with Massachusetts, DC, and Hawaii lowest, and Texas, Oklahoma, and Georgia with the highest Uninsured population percentages.
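
A stacked bar chart like the one described above is built from per-category shares that are then sorted. The following is a minimal sketch of that preparation step, assuming made-up regions and counts; the names `shares`, `coverage`, and the region labels are hypothetical and do not come from the original chart.

```python
def shares(counts):
    """Convert raw counts per category into percentages that sum to 100."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {name: 100 * value / total for name, value in counts.items()}


# Hypothetical coverage counts (in thousands) for three made-up regions.
coverage = {
    "Region A": {"Employer": 520, "Public": 390, "Uninsured": 90},
    "Region B": {"Employer": 610, "Public": 350, "Uninsured": 40},
    "Region C": {"Employer": 450, "Public": 380, "Uninsured": 170},
}

percentages = {region: shares(counts) for region, counts in coverage.items()}

# Order regions by ascending uninsured share, as a sorted stacked bar chart would.
ordered = sorted(percentages, key=lambda region: percentages[region]["Uninsured"])

for region in ordered:
    row = percentages[region]
    # One text "bar" per region: each character stands for roughly 2 percentage points,
    # and the letter is the first letter of the coverage type.
    bar = "".join(label[0] * round(pct / 2) for label, pct in row.items())
    print(f"{region}  {bar}  uninsured {row['Uninsured']:.1f}%")
```

Converting to percentages before sorting keeps regions of different sizes comparable, which is why stacked bars of this kind are usually drawn at equal height.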

Visualizing Distribution and Composition

Distributions help us understand the spread and composition of numeric data. Histograms and area charts reveal where data points fall, how skewed they are, and where they concentrate.
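
A histogram is a count of values per bin. The sketch below shows that counting step and prints a text version of the result, assuming integer values and an integer bin width; the function name `histogram_counts` and the sample data are hypothetical.

```python
def histogram_counts(values, bin_width):
    """Count integer values per half-open bin [left, left + bin_width)."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if not values:
        return {}
    # Left edge of the bin each value falls into.
    lefts = [(v // bin_width) * bin_width for v in values]
    # Create every bin between the smallest and largest edge so empty bins show up too.
    counts = {left: 0 for left in range(min(lefts), max(lefts) + bin_width, bin_width)}
    for left in lefts:
        counts[left] += 1
    return counts


# Hypothetical costs in thousands of dollars, made up for illustration.
costs_thousands = [3, 4, 4, 5, 6, 6, 7, 8, 9, 11, 12, 14, 18, 23, 31, 47]
bin_width = 10

counts = histogram_counts(costs_thousands, bin_width)
for left, count in counts.items():
    print(f"{left:>2}-{left + bin_width - 1:<2} | {'#' * count}")
```

In this made-up sample most values land in the lowest bin and a few stretch far to the right, which is the shape usually described as right-skewed.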

Boxplots and violin plots show the distribution shape, while heatmaps visualize the density of a phenomenon across dimensions. When creating heatmaps, use high-contrast colors for accessibility.
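
A boxplot compresses a distribution into a few summary numbers. The sketch below computes them with Python's `statistics.quantiles`, assuming the common convention of flagging points more than 1.5 times the interquartile range beyond the quartiles as outliers. The function name `boxplot_summary` and the sample values are hypothetical, and plotting libraries may compute quartiles with slightly different interpolation rules.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return the quartiles, whisker ends, and outliers a boxplot would draw."""
    data = sorted(values)
    # Three cut points that split the data into four equal-probability groups.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers extend to the most extreme points that are still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "whisker_low": inside[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical measurements, made up for illustration.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]
summary = boxplot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The box spans q1 to q3 with a line at the median, so a single extreme value such as the 30 in this sample appears as a separate point instead of stretching the box.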

Visualizations for Your Audience

When creating data visualizations, consider your audience’s cognitive load, which is the difficulty they face in processing new information. Cognitive load is influenced by the complexity of the information, the audience’s background knowledge, and any distracting elements in the presentation.

To reduce cognitive load and communicate your message effectively, seek feedback; use annotations, descriptive titles, and captions to provide context; and minimize unnecessary design elements.

A diagram of cognitive load. A brain is divided into three sections, with labels: 1. Intrinsic Load: How complex is the information? 2. Germane Load: How much background information does the audience have? and 3. Extraneous Load: What design choices can I make to streamline the sharing of information?

Leveraging AI for Data Visualizations

Generative AI can help fill gaps in coding knowledge and increase efficiency when creating data visualizations. However, critical thinking and an understanding of real-world context are essential for creating meaningful visualizations, and LLMs cannot replace either.

Use LLMs as coding buddies to explore new languages, but learn data visualization and design theory first. Engage in conversation with the LLM and iterate based on its outputs, while maintaining data privacy and checking for plagiarism.