Let’s say we have some data on student wages at the state level in a dataset called `wages`. It has the following variables:

• `state`: state where the universities are located
• `year`: year the data is from
• `avg_wage`: average student wage for all public universities in the state
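To make the structure concrete, here is a minimal sketch of what a data frame shaped like `wages` might look like. The second state (Oregon), the year range, and every wage value are invented for illustration; only the three column names come from the description above.

```r
# Hypothetical stand-in for `wages`; all values below are made up
wages <- data.frame(
  state = rep(c("California", "Oregon"), each = 11),
  year = rep(2007:2017, times = 2),
  avg_wage = c(
    seq(9.0, 11.0, length.out = 11),  # invented California wages
    seq(8.5, 10.5, length.out = 11)   # invented Oregon wages
  )
)

# One row per state-year combination
head(wages)
```

Because each row is a state-year pair, plotting a single state means filtering on `state` first.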

Start by using the package ggplot2 in R to re-create the student wage line plot from the last exercise. Since we only want to plot California schools, filter by the `state` variable and then add code for our plot.

```r
# import libraries
library(dplyr)
library(ggplot2)

# plot wages versus years
ca_wages <- wages %>%
  # only California schools
  filter(state == "California") %>%
  # wages over time
  ggplot(aes(x = year, y = avg_wage)) +
  # line plot
  geom_line()
```

The situation created by the law is a natural experiment. Rather than having a researcher randomly assign treatment and control groups to study minimum wage effects, treatment assignment is decided by some outside force. In this case, that outside force was the minimum wage law that went into effect in 2017.

Let’s add a dashed vertical line to our plot to separate the time before and after the law went into effect. We’ll also label the x-axis scale to see the years more clearly.

```r
ca_wages +
  # dashed line separating the years before and after the law
  geom_vline(xintercept = 2016, linetype = "dashed") +
  # label every year on the x-axis
  scale_x_continuous(breaks = 2007:2017)
```

We can see that there is some change after the law takes effect, but we don’t know whether the change is due to the law or to other conditions that changed at the same time. We need some data to use as a counterfactual to this situation: what student wages in California would have looked like if the law had never happened.
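As a rough numeric companion to the plot, the sketch below averages wages on each side of the dashed line. The `ca` data frame and its wage values are invented for illustration, and treating 2016 and earlier as "before" is an assumption chosen to match the position of the dashed line.

```r
library(dplyr)

# Hypothetical California data; the wage values are made up
ca <- data.frame(
  year = 2007:2017,
  avg_wage = c(9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10.8)
)

# Label each year relative to the dashed line at 2016,
# then average wages within each period
before_after <- ca %>%
  mutate(period = if_else(year <= 2016, "before", "after")) %>%
  group_by(period) %>%
  summarize(mean_wage = mean(avg_wage))

before_after
```

A gap between the two means only describes what changed over time. On its own it cannot separate the law from anything else that changed in the same period, which is why a counterfactual is needed.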

### Instructions

1. Let’s say there was a new entertainment tax in Sydney starting in 2019. You want to find out if the tax affected movie theater ticket sales. You have data about average annual movie theater ticket sales in Sydney from 2012 through 2019 with the following variables:

• `city`: city where the movie theaters are located
• `year`: year the data is from
• `sales`: average ticket sales for theaters in the city

This data is contained in the dataset `tickets`, which has been loaded for you in `notebook.Rmd`. The first few rows are printed in the workspace.

Make a line plot that shows average movie ticket sales for Sydney by year. Remember to filter `city` to look at just Sydney. Save the plot as `syd_sales`.

2. Add to `syd_sales` a dashed vertical line at `x = 2018` and x-axis scale labels for the years 2012 to 2019. What happened to ticket sales in the year the tax was implemented?