We’ve gone over each of the basic units in the grammar of graphics: data, geometries, and aesthetics. Let’s extend this new knowledge to create a new type of plot: the bar chart. Bar charts are great for showing the distribution of categorical data. Typically, one of the axes on a bar chart will have numerical values and the other will have the names of the different categories you wish to understand.
Let’s build a bar chart by using some of the R built-in datasets. These are data frames that you can readily access in your code to explore and create visualizations. They are handy because these built-in datasets usually include nicely distributed categorical data.
The geom_bar() layer adds a bar chart to the canvas. Typically when creating a bar chart, you assign an aes() aesthetic mapping with a single categorical value on the x axis, and the geom_bar() layer will compute the count for each category and display the count values on the y axis.
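To make the counting step concrete, here is a minimal sketch that compares the counts computed by the bar layer with a manual tally. The toy_books data frame and its values are hypothetical, made up only for illustration; layer_data() is a ggplot2 helper that returns the data computed for a layer.

```r
library(ggplot2)

# Hypothetical toy data: one row per book, with the language it was written in
toy_books <- data.frame(
  Language = c("English", "French", "English", "Spanish", "English", "French")
)

# Only x is mapped; the bar layer counts the rows in each category
counts_plot <- ggplot(toy_books, aes(x = Language)) +
  geom_bar()

# Data computed for the first (and only) layer, including a count column
computed <- layer_data(counts_plot)
computed$count

# Manual tally of the same column, for comparison
table(toy_books$Language)
```

The count column should match the manual tally, which is why no y mapping is needed for a basic bar chart.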
Since we’re extending the grammar of graphics, let’s also learn about how to save our visuals as local image files.
The following code maps the count of each category in the Language column in a dataset of 100 popular books to a bar length and then saves the visualization as a .png file named bar-example.png:
    bar <- ggplot(books, aes(x = Language)) +
      geom_bar()
    bar
    ggsave("bar-example.png")
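The ggsave() function lets you save a visualization as a local file with the name of your choice. By default it saves the most recently displayed plot, which is why bar is printed before ggsave() is called. It’s a useful function when developing visualizations locally.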
The code above outputs a bar chart with one bar per language, where the length of each bar is the number of books in that language.
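If you prefer not to rely on the most recently displayed plot, ggsave() also accepts the plot explicitly, along with the output size. The sketch below assumes a hypothetical toy data frame and writes to a temporary directory so it does not clutter your working folder; the width, height, and dpi values are arbitrary choices.

```r
library(ggplot2)

# Hypothetical toy data, used only so this sketch is self-contained
toy_books <- data.frame(
  Language = c("English", "French", "English", "Spanish", "English", "French")
)

save_demo <- ggplot(toy_books, aes(x = Language)) +
  geom_bar()

# Write into the session's temporary directory (an assumption for this sketch)
out_file <- file.path(tempdir(), "bar-example.png")

# Pass the plot explicitly; width and height are in inches by default
ggsave(out_file, plot = save_demo, width = 6, height = 4, dpi = 150)

# Confirm that the file was written
file.exists(out_file)
```

Passing plot explicitly can make a script easier to follow when several plots are created before any of them is saved.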
The mpg dataset is a built-in dataset describing fuel economy data from 1999 and 2008 for 38 popular models of cars, and it is included with ggplot2.
Inspect the built-in dataset mpg by printing its head(). Take special note of the class column, which describes the vehicle class of each car and has a total of 7 types (compact, SUV, minivan, etc.).
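One way this inspection might look is sketched below. Beyond head(), the calls to unique() and table() are optional extras, added here only to show the categories that the bar chart will count.

```r
library(ggplot2)  # mpg ships with ggplot2

# First six rows of the dataset
head(mpg)

# The distinct vehicle classes in the class column
unique(mpg$class)

# How many cars fall into each class
table(mpg$class)
```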
Create a variable bar that is equal to a ggplot() object with the mpg built-in dataset associated as its data.
We want to understand the breakdown of the types of vehicles in the dataset, so provide the canvas, or the ggplot() object, with an aesthetic mapping aes() that makes the x axis represent the categorical values of the class column in the dataframe. ggplot2 will count each unique value in the class column and automagically assign that count to the y axis. Add a geom_bar() layer to bar. Be sure to type bar after you’ve declared the variable and added the layer so that the plot can render in your R notebook output.
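A minimal sketch of these steps, assuming ggplot2 is installed and loaded, could look like this. It is one possible way to write them, not the only one.

```r
library(ggplot2)

# Canvas: mpg as the data, class mapped to the x axis
bar <- ggplot(data = mpg, aes(x = class)) +
  geom_bar()  # counts the rows in each class and draws one bar per class

# Typing the variable name renders the plot
bar
```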
Let’s add some color to the bar chart by adding an aes() aesthetic mapping to the geom_bar() layer that fills the color of each bar based on the class value.
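As a sketch of this step, the fill mapping can be placed inside the geom_bar() layer so that it applies only to the bars. The colors themselves come from ggplot2's default discrete palette.

```r
library(ggplot2)

bar <- ggplot(mpg, aes(x = class)) +
  geom_bar(aes(fill = class))  # each class gets its own fill color and a legend entry

bar
```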
Our plot could use some context. Let’s add a title and a subtitle so that users can understand more about what this bar chart and its underlying data show. Use the labs() function to assign a new title that describes this plot as illustrating the "Types of Vehicles" and a subtitle describing the data as "From fuel economy data for popular car models (1999-2008)".
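Putting the pieces together, one possible version of the finished chart is sketched below. The title and subtitle strings are taken from the instructions above.

```r
library(ggplot2)

bar <- ggplot(mpg, aes(x = class)) +
  geom_bar(aes(fill = class)) +
  labs(
    title = "Types of Vehicles",
    subtitle = "From fuel economy data for popular car models (1999-2008)"
  )

bar
```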