Before we use a Chi-Square test, we need to be sure that the following things are true:

#### 1. The observations should be independently randomly sampled from the population

This is also true of 2-sample t-tests, ANOVA, and Tukey. The purpose of this assumption is to ensure that the sample is representative of the population of interest.

#### 2. The categories of both variables must be mutually exclusive

In other words, individual observations should only fall into one category per variable. This means that categorical variables like “college major”, where students can have multiple different college majors, would not be appropriate for a Chi-Square test.
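One way to screen for this before running a test is to look for observations that appear under more than one category of the same variable. The following is a minimal sketch using only the standard library; the `rows` data and the `ids_in_multiple_categories` helper are hypothetical and only illustrate the idea.

```python
# Hypothetical survey rows: (student_id, major). Student 2 lists two majors.
rows = [
    (1, "biology"),
    (2, "math"),
    (2, "physics"),
    (3, "history"),
]

def ids_in_multiple_categories(rows):
    """Return the IDs that appear under more than one distinct category."""
    categories_by_id = {}
    for obs_id, category in rows:
        categories_by_id.setdefault(obs_id, set()).add(category)
    return sorted(
        obs_id for obs_id, cats in categories_by_id.items() if len(cats) > 1
    )

overlapping = ids_in_multiple_categories(rows)
print(overlapping)  # a non-empty list means the categories are not mutually exclusive
```

If the list is non-empty, at least one observation falls into several categories of the variable, so this assumption does not hold for the data as recorded.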

#### 3. The groups should be independent

Similar to 2-sample t-tests, ANOVA, and Tukey, a Chi-Square test also shouldn’t be used if either of the categorical variables splits observations into groups that can influence one another. For example, a Chi-Square test would **not** be appropriate if one of the variables represents three different time points.
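Repeated measurements are one common source of dependence between groups, and they can often be spotted by checking whether the same subject shows up in more than one group. Here is a small sketch under that assumption; the `rows` data and the `shared_subjects` helper are made up for illustration, and this check cannot detect other kinds of dependence (for example, related people placed in different groups).

```python
# Hypothetical long-format rows: (subject_id, group).
rows = [
    (1, "week 1"), (1, "week 2"),
    (2, "week 1"), (2, "week 2"),
    (3, "week 1"),
]

def shared_subjects(rows):
    """Return the subject IDs that appear in more than one group."""
    members_by_group = {}
    for subject_id, group in rows:
        members_by_group.setdefault(group, set()).add(subject_id)

    seen, shared = set(), set()
    for members in members_by_group.values():
        shared |= seen & members  # subjects already seen in an earlier group
        seen |= members
    return sorted(shared)

repeated = shared_subjects(rows)
print(repeated)  # a non-empty list suggests the groups are not independent
```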

### Instructions

**1.**

Researchers are running a study to test a new vaccine for Covid-19 in adults. A sample of 1000 adults (you can assume that they are randomly sampled adults, or at least representative of the population) are randomly split into two groups: half get a vaccine, while the other half get a placebo. Everyone is monitored for six months to see if they develop symptoms of Covid-19. The first few rows of the resulting dataset look like this:

Group | Outcome |
---|---|
vaccine | not sick |
vaccine | not sick |
placebo | sick |
placebo | not sick |

The researchers want to use this data to determine whether their vaccine will be effective at preventing illness in the general population of adults (i.e., is whether or not someone got a vaccine associated with whether or not they got sick?).

Is a Chi-Square test appropriate to address this question? Change the value of `checkpoint_1` to `True` if a Chi-Square test is appropriate and `False` if it is not.

**2.**

Researchers are interested in studying the effect of a 10-minute yoga regimen on self-reported mood in adults. To test this, a representative sample of 1000 adults are asked to complete a survey where they rate their current happiness level as “very low”, “low”, “neutral”, “high”, or “very high”. Each person then completes a 10-minute yoga regimen and responds to the same survey once again. The first few rows of data from this study look like this:

Person ID | Time | Happiness |
---|---|---|
1 | before yoga | low |
1 | after yoga | neutral |
2 | before yoga | neutral |
2 | after yoga | high |

The researchers want to know if 10 minutes of yoga can help improve self-reported mood for adults in the general population (i.e., is whether or not someone has just completed 10 minutes of yoga associated with their self-reported happiness level?).

Is a Chi-Square test appropriate to address this question? Change the value of `checkpoint_2` to `True` if a Chi-Square test is appropriate and `False` if it is not.