If we knew both potential outcomes for every individual, we could compute several statistics that summarize the effect of the treatment:

  • The individual treatment effect (ITE) is computed as Y1 - Y0. This statistic directly compares the two potential outcomes for each individual.
  • The average treatment effect (ATE) is the average of all individual treatment effects. Equivalently, it is the average of Y1 minus the average of Y0, taken across all individuals.
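
The two definitions above can be sketched in a few lines of Python. The numbers here are made up for illustration (they are not the lesson's patient data), and the variable names are assumptions. The sketch shows that averaging the individual effects and subtracting the two averages give the same ATE.

```python
# Hypothetical potential outcomes for four individuals.
# These values are invented for illustration only.
y1 = [10.0, 12.0, 9.0, 11.0]   # outcome each individual would have if treated
y0 = [14.0, 15.0, 13.0, 12.0]  # outcome each individual would have if untreated

# Individual treatment effect: Y1 - Y0, computed per individual.
ite = [treated - untreated for treated, untreated in zip(y1, y0)]

# ATE computed as the average of the individual treatment effects.
ate_from_ite = sum(ite) / len(ite)

# ATE computed as (average of Y1) - (average of Y0).
ate_from_means = sum(y1) / len(y1) - sum(y0) / len(y0)

print("ITE:", ite)
print("ATE (mean of ITEs):", ate_from_ite)
print("ATE (difference of means):", ate_from_means)
```

Both calculations require Y1 and Y0 for every individual, which is exactly the information we never have in practice.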

Hold up! You may be wondering, “How are we supposed to calculate the individual treatment effect or true ATE if we can never observe the counterfactual outcome?” This question gets at the fundamental problem of causal inference. Causal inference is essentially a missing data problem: since we can only observe the outcome that actually happened, we are always missing the counterfactual outcome.


The table at the right shows data for 12 hospital patients who either interacted with a therapy animal (Z = 1) or did not interact with a therapy animal (Z = 0). Select “Theoretical” or “Reality” to toggle between the theoretical data (which is unobservable in real life) and what we could actually observe from these 12 patients in reality.

The theoretical data contains both potential outcomes for every patient. Because we have both, we can compute the true ATE by subtracting the average of Y0 from the average of Y1. The true ATE is -5.8, which means that interacting with a therapy animal lowers cortisol levels by 5.8 units on average.

The real data contains only one potential outcome for each individual. Notice now that the Y1 values are missing for individuals in the control group (Z = 0), and the Y0 values are missing for individuals in the treatment group (Z = 1). In real life, we will always be missing half of the data we need to calculate the true ATE.
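
As a rough illustration of this missing data pattern, the sketch below turns a small "theoretical" table into what we would observe in "reality". The four rows are invented values, not the 12 patients from the lesson, and `observe` is a hypothetical helper written for this example.

```python
# Hypothetical "theoretical" data: (Z, Y0, Y1) for four patients.
# Values are invented for illustration only.
theoretical = [
    (1, 14.0, 10.0),
    (0, 15.0, 12.0),
    (1, 13.0, 9.0),
    (0, 12.0, 11.0),
]


def observe(z, y0, y1):
    """Keep only the potential outcome matching the treatment actually received."""
    # Treated patients (Z = 1) reveal Y1; control patients (Z = 0) reveal Y0.
    return (z, None, y1) if z == 1 else (z, y0, None)


reality = [observe(z, y0, y1) for z, y0, y1 in theoretical]

# Count how many potential outcomes are missing in the observed data.
missing = sum(value is None for _, y0, y1 in reality for value in (y0, y1))
total = 2 * len(reality)

# An ITE needs both Y1 and Y0 for the same patient, so none can be computed.
computable_ites = [
    y1 - y0 for _, y0, y1 in reality if y0 is not None and y1 is not None
]

for row in reality:
    print(row)
print(f"Missing potential outcomes: {missing} of {total}")
print("Computable ITEs:", computable_ites)
```

Every patient contributes exactly one observed outcome and one missing outcome, so no individual treatment effect can be computed directly from the observed table.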
