You’ve completed the Data Visualization in R lesson! You now know how to choose and implement different kinds of geoms in ggplot2, how to customize your plot axes, and how to visualize additional variables through facets.
Below is a summary of the key concepts you learned – great job!
How to create different geoms and when to use which type:
- Histograms can be created using geom_histogram() to show the distribution of a continuous variable.
- Heatmaps can be created using geom_bin2d() to show the distribution of the intersections of two continuous variables.
- Box-and-whisker plots can be created using geom_boxplot() to show the distribution of a continuous variable by quantiles, e.g. 25th, 50th, and 75th percentiles.
- Bar plots can be created using geom_bar(), which shows the count of observations for different values of a discrete variable by default. geom_col() will create bar plots showing the value of the variable on the y axis rather than counts.
- Using the position argument and a fill aesthetic mapping, we can create stacked bar plots (position = "stack"), stacked bar plots showing ratios (position = "fill"), and clustered bar plots (position = "dodge").
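As a quick refresher, the geoms and position options above can be sketched as follows. This is only an illustration: it assumes the mpg dataset that ships with ggplot2 as a stand-in (it is not the lesson's own data), and the object names are made up for this example.

```r
library(ggplot2)

# mpg ships with ggplot2; it is a stand-in dataset, not the lesson's data.

# Distribution of one continuous variable
p_hist <- ggplot(mpg, aes(x = hwy)) + geom_histogram(bins = 20)

# Joint distribution of two continuous variables, binned into tiles
p_heat <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_bin2d()

# Distribution of a continuous variable by quartiles, per group
p_box <- ggplot(mpg, aes(x = class, y = hwy)) + geom_boxplot()

# geom_bar() counts the observations for each value of x
p_bar <- ggplot(mpg, aes(x = class)) + geom_bar()

# geom_col() expects the bar heights to already be in the data
class_counts <- as.data.frame(table(class = mpg$class))
p_col <- ggplot(class_counts, aes(x = class, y = Freq)) + geom_col()

# A fill mapping plus the position argument controls how groups share a bar
p_stack <- ggplot(mpg, aes(x = class, fill = drv)) + geom_bar(position = "stack")
p_fill  <- ggplot(mpg, aes(x = class, fill = drv)) + geom_bar(position = "fill")
p_dodge <- ggplot(mpg, aes(x = class, fill = drv)) + geom_bar(position = "dodge")
```

Printing any of these objects (for example, typing p_dodge at the console) draws the plot.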
How to show different statistics in our data:
- The stat argument allows us to display different kinds of values. stat = "identity" will show the y axis variable values on a bar plot as is, rather than displaying the x axis value counts. stat = "summary" combined with a function supplied in fun will display bar heights based on the summary function. For example, stat = "summary", fun = "mean" will calculate and display means.
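The two stat options can be compared side by side in a short sketch. It again assumes the mpg dataset bundled with ggplot2, and class_means is a hypothetical helper table built just for this example. The fun argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Pre-compute the mean highway mileage per class (hypothetical helper table)
class_means <- aggregate(hwy ~ class, data = mpg, FUN = mean)
names(class_means)[2] <- "mean_hwy"

# stat = "identity": bar heights are taken from the y column as is
p_identity <- ggplot(class_means, aes(x = class, y = mean_hwy)) +
  geom_bar(stat = "identity")

# stat = "summary": ggplot2 applies the function in fun to the raw y values
p_summary <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_bar(stat = "summary", fun = "mean")
```

Both plots should show the same bar heights; the difference is whether the means are computed beforehand or by ggplot2.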
How to add error bars to bar plots to show variance around a mean:
- geom_errorbar() creates error bars on bar plots when provided ymin and ymax variables representing the lower and upper bounds of error ranges.
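A minimal sketch of error bars on a bar plot of means is shown below. It assumes the mpg dataset from ggplot2, and it uses one standard error on either side of the mean as the error range; that choice, and the summary_df table, are assumptions made for this example rather than something the lesson prescribes.

```r
library(ggplot2)

# Mean and standard error of highway mileage per class
# (the standard error is an assumed choice of error range)
means <- tapply(mpg$hwy, mpg$class, mean)
ses   <- tapply(mpg$hwy, mpg$class, function(v) sd(v) / sqrt(length(v)))

summary_df <- data.frame(
  class    = names(means),
  mean_hwy = as.numeric(means),
  se       = as.numeric(ses)
)

# Bars show the means; error bars span mean - se to mean + se
p_error <- ggplot(summary_df, aes(x = class, y = mean_hwy)) +
  geom_col() +
  geom_errorbar(aes(ymin = mean_hwy - se, ymax = mean_hwy + se), width = 0.2)
```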
How to customize discrete and continuous axes:
- We can customize discrete axes using scale_x_discrete() and scale_y_discrete().
- We can customize continuous axes using scale_x_continuous() and scale_y_continuous().
- We can zoom in on a region of our data using coord_cartesian().
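The axis functions above might be combined as in the following sketch. The mpg dataset, the axis titles, and the chosen limits are all assumptions for illustration.

```r
library(ggplot2)

p_axes <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot() +
  # Discrete axis: set the title and transform the category labels
  scale_x_discrete(name = "Vehicle class", labels = toupper) +
  # Continuous axis: set the title and the tick positions
  scale_y_continuous(name = "Highway miles per gallon",
                     breaks = seq(10, 45, by = 5)) +
  # Zoom in on a y range without dropping any observations
  coord_cartesian(ylim = c(15, 35))
```

Because coord_cartesian() only changes the visible window, the boxplot statistics are still computed from all of the data, including points outside the zoomed range.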
How to show additional variables in panels of a grid using facets:
- By adding facet_grid(), we can map up to two additional variables along facet columns and rows.
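For example, a scatter plot can be split into a grid of panels as sketched below. The mpg dataset and the choice of drv and year as facet variables are assumptions for this illustration; vars() requires ggplot2 3.0.0 or later.

```r
library(ggplot2)

# One panel per combination of drive type (rows) and model year (columns)
p_facets <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_grid(rows = vars(drv), cols = vars(year))
```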
Instructions
The code included to the right creates the plot shown in the very first exercise, depicting how different animals spend the hours of their day. Feel free to experiment with this plot and modify it further!