Let’s say that we have performed an ANOVA to compare sales at the three VeryAnts stores. We calculated a p-value less than 0.05 and concluded that there is a significant difference between at least one pair of stores.

Now, we want to find out **which** pairs of stores differ. This is where Tukey’s range test comes in handy!

In Python, we can perform Tukey’s range test using the `statsmodels` function `pairwise_tukeyhsd()`. For example, suppose we are again comparing video-game scores for math majors, writing majors, and psychology majors. We have a dataset named `data` with two columns: `score` and `major`. We could run Tukey’s range test with a type I error rate of 0.05 as follows:

```python
from statsmodels.stats.multicomp import pairwise_tukeyhsd

tukey_results = pairwise_tukeyhsd(data.score, data.major, 0.05)
print(tukey_results)
```

Output:

```
Multiple Comparison of Means - Tukey HSD,FWER=0.05
==========================================
group1 group2 meandiff lower  upper reject
------------------------------------------
math   psych      3.32 -0.11   6.74  False
math   write      5.23  2.03   8.43   True
psych  write     -2.12 -5.25   1.01  False
------------------------------------------
```
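To try the call end to end without the lesson's dataset, here is a minimal sketch. The scores below are made up purely for illustration (they are not the data behind the output above), and the keyword `alpha` is used in place of the positional `0.05`.

```python
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Made-up scores for illustration only; NOT the lesson's data
data = pd.DataFrame({
    "score": [70.0, 72.0, 68.0, 71.0, 69.0,   # math
              69.0, 71.0, 70.0, 68.0, 72.0,   # psych
              90.0, 92.0, 88.0, 91.0, 89.0],  # write
    "major": ["math"] * 5 + ["psych"] * 5 + ["write"] * 5,
})

# alpha is the overall (family-wise) type I error rate for all comparisons
tukey_results = pairwise_tukeyhsd(data["score"], data["major"], alpha=0.05)
print(tukey_results)
```

With three groups, the printed table should contain three rows, one for each pair of majors.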

Tukey’s range test is similar to running three separate two-sample t-tests, except that it runs all of these tests simultaneously in order to preserve the overall type I error rate.
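To see why this matters, the short calculation below estimates the chance of at least one false positive when three tests are each run at 0.05. It assumes the three tests are independent, which is not strictly true for pairwise comparisons on the same groups, so treat the number as a rough approximation.

```python
alpha = 0.05          # type I error rate for each individual test
num_comparisons = 3   # math vs. psych, math vs. write, psych vs. write

# Probability of at least one false positive across all tests,
# assuming (as a simplification) that the tests are independent
family_wise_error = 1 - (1 - alpha) ** num_comparisons
print(round(family_wise_error, 4))  # 0.1426
```

Under that assumption, three uncorrected t-tests carry roughly a 14% chance of a false positive, whereas Tukey’s range test is designed to hold the overall rate at 5%.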

The function output is a table, with one row per pair-wise comparison. For every comparison where `reject` is `True`, we “reject the null hypothesis” and conclude there *is* a significant difference between those two groups. For example, in the output above, we would conclude that there is a significant difference between scores for math and writing majors, but no significant difference in scores for the other comparisons.
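The `reject` column can also be read in code rather than by eye. The sketch below uses made-up scores and the `reject` and `groupsunique` attributes of the result object. It assumes the rows follow the same pair order as `itertools.combinations` over the sorted group labels, which matches the printed table here but is worth checking against your own output.

```python
from itertools import combinations

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Made-up scores for illustration only; NOT the lesson's data
scores = np.array([70.0, 72.0, 68.0, 71.0, 69.0,    # math
                   69.0, 71.0, 70.0, 68.0, 72.0,    # psych
                   90.0, 92.0, 88.0, 91.0, 89.0])   # write
majors = np.array(["math"] * 5 + ["psych"] * 5 + ["write"] * 5)

tukey_results = pairwise_tukeyhsd(scores, majors, alpha=0.05)

# Pair each row's reject flag with the two group labels it compares
# (assumes rows are ordered like combinations of the sorted labels)
pairs = combinations(tukey_results.groupsunique, 2)
significant_pairs = [
    (str(group1), str(group2))
    for (group1, group2), reject in zip(pairs, tukey_results.reject)
    if reject
]
print(significant_pairs)
```

In this made-up data the writing scores sit far above the other two groups, so only the pairs involving `write` should appear in the list.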

### Instructions

**1.**

The `veryants` dataset is provided for you once again in `script.py`. The `Store` column represents the store where a sale was made (`'A'`, `'B'`, or `'C'`) and the `Sale` column represents the amount of the sale in USD.

Run Tukey’s range test with a type I error rate of `0.05` to determine whether average sales differ between any pair of stores. Save the result as `tukey_results`, then print it out.

**2.**

Inspect the output from the test you just ran. For which pairs of stores did you find a significant difference in average sales?

Set `a_b_significant`, `a_c_significant`, and `b_c_significant` to `True` if the test indicates a significant difference in sales at the indicated pair of stores and `False` if not.

Recall that when we ran three t-tests, we found significant differences for the A vs. B and A vs. C comparisons. Do we get the same result with this test?