You’ve completed the Data Visualization in R lesson! You now know how to choose and implement different kinds of geoms in
ggplot2, how to customize your plot axes, and how to visualize additional variables through facets.
Below is a summary of the key concepts you learned – great job!
How to create different geoms and when to use which type:
- Histograms can be created using geom_histogram() to show the distribution of a continuous variable.
- Heatmaps can be created using geom_bin2d() to show the distribution of the intersections of two continuous variables.
- Box-and-whisker plots can be created using geom_boxplot() to show the distribution of a continuous variable by quantiles, e.g. 25th, 50th, and 75th percentiles.
- Bar plots can be created using geom_bar(), which shows the count of observations for different values of a discrete variable by default.
- geom_col() will create bar plots showing the value of the variable on the y axis rather than counts.
- Using the position argument and a fill aesthetic mapping, we can create stacked bar plots (position = "stack"), stacked bar plots showing ratios (position = "fill"), and clustered bar plots (position = "dodge").
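As a quick refresher, the geoms above can be sketched as follows. This is a minimal sketch rather than the lesson's own code: it assumes the mpg dataset that ships with ggplot2 (not the data used in the exercises), and the variable choices and the avg_hwy helper table are ours, picked for illustration.

```r
library(ggplot2)

# mpg ships with ggplot2: hwy and displ are continuous, class and drv are discrete.

# Distribution of one continuous variable
p_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

# Binned counts over two continuous variables
p_heat <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_bin2d()

# Quartiles of a continuous variable for each level of a discrete one
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

# geom_bar() counts the rows for each class by default
p_bar <- ggplot(mpg, aes(x = class)) +
  geom_bar()

# geom_col() plots a y value we supply; here, a precomputed mean per class
avg_hwy <- aggregate(hwy ~ class, data = mpg, FUN = mean)
p_col <- ggplot(avg_hwy, aes(x = class, y = hwy)) +
  geom_col()

# The same bar plot with a fill mapping and three position settings
p_stack <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "stack")   # stacked counts
p_fill <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "fill")    # stacked proportions, each bar sums to 1
p_dodge <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "dodge")   # side-by-side bars
```

Printing any of these objects, for example p_fill, draws the plot.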
How to show different statistics in our data:
- The stat argument allows us to display different kinds of values.
- stat = "identity" will show the y axis variable values on a bar plot as is, rather than displaying the x axis value counts.
- stat = "summary" combined with a function supplied in fun will display bar heights based on the summary function. For example, stat = "summary", fun = "mean" will calculate and display means.
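The two stat settings can be compared side by side. The sketch below again assumes the mpg dataset bundled with ggplot2, and avg_hwy is a helper table we compute ourselves; both plots should end up with the same bar heights.

```r
library(ggplot2)

# Precompute the mean highway mileage per class
avg_hwy <- aggregate(hwy ~ class, data = mpg, FUN = mean)

# stat = "identity": bar heights are the y values exactly as supplied
p_identity <- ggplot(avg_hwy, aes(x = class, y = hwy)) +
  geom_bar(stat = "identity")

# stat = "summary": ggplot2 applies fun to the raw y values within each class
p_summary <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_bar(stat = "summary", fun = "mean")
```

Letting ggplot2 compute the summary saves a data-preparation step, while stat = "identity" is handy when the values have already been calculated elsewhere.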
How to add error bars to bar plots to show variance around a mean:
- geom_errorbar() creates error bars on bar plots when provided ymin and ymax variables representing the lower and upper bounds of error ranges.
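Here is one way error bars might be layered on a bar plot of means. It is a sketch under assumptions: it uses the mpg dataset bundled with ggplot2, and it takes one standard error on either side of the mean as the error range, which is our choice rather than something the lesson prescribes. The stats table and its column names are hypothetical.

```r
library(ggplot2)

# Mean and standard error of highway mileage for each class
stats <- do.call(rbind, lapply(split(mpg$hwy, mpg$class), function(v) {
  data.frame(mean_hwy = mean(v), se = sd(v) / sqrt(length(v)))
}))
stats$class <- rownames(stats)

# Bars show the means; error bars span mean - se to mean + se
p_err <- ggplot(stats, aes(x = class, y = mean_hwy)) +
  geom_col() +
  geom_errorbar(aes(ymin = mean_hwy - se, ymax = mean_hwy + se), width = 0.2)
```

The width argument only controls how wide the horizontal caps are drawn; the vertical extent comes entirely from ymin and ymax.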
How to customize discrete and continuous axes:
- We can customize discrete axes using
- We can customize continuous axes using
- We can zoom in on a region of our data using
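The function names are missing from the three bullets above, so the sketch below is an assumption about what they refer to: scale_x_discrete() for a discrete axis, scale_y_continuous() for a continuous axis, and coord_cartesian() for zooming are the usual ggplot2 tools for these tasks. The data (mpg, bundled with ggplot2), labels, breaks, and limits are illustrative choices.

```r
library(ggplot2)

p_axes <- ggplot(mpg, aes(x = drv, y = hwy)) +
  geom_boxplot() +
  # Relabel the levels of the discrete x axis
  scale_x_discrete(labels = c("4" = "Four-wheel", "f" = "Front-wheel", "r" = "Rear-wheel")) +
  # Place tick marks every 5 units on the continuous y axis
  scale_y_continuous(breaks = seq(10, 45, by = 5)) +
  # Zoom in on a y range without dropping any observations
  coord_cartesian(ylim = c(15, 35))
```

Zooming with coord_cartesian() only changes the visible window, so the boxplot statistics are still computed from all of the data.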
How to show additional variables in panels of a grid using facets:
- By adding facet_grid(), we can map up to two additional variables along facet columns and rows.
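A faceted plot might look like the sketch below. It assumes the mpg dataset bundled with ggplot2, and the choice of drv for rows and year for columns is ours.

```r
library(ggplot2)

# One histogram panel for each combination of drv (rows) and year (columns)
p_facets <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2) +
  facet_grid(rows = vars(drv), cols = vars(year))
```

With three drive types and two model years in mpg, this produces a grid of six panels that share the same axes, which makes the distributions easy to compare.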
The code included to the right creates the plot shown in the very first exercise, depicting how different animals spend the hours of their day. Feel free to experiment with this plot and modify it further!