Whenever we run a hypothesis test using a significance threshold, we expose ourselves to two different kinds of mistakes, *type I errors* (false positives) and *type II errors* (false negatives):

| Null hypothesis: | is true | is false |
|---|---|---|
| P-value significant | Type I error | Correct! |
| P-value not significant | Correct! | Type II error |
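The table can also be read as a small decision rule. The sketch below is only an illustration: `classify_outcome` is a hypothetical helper, the string labels and the default threshold of 0.05 are assumptions, and "significant" is taken to mean a p-value strictly below the threshold. In a real analysis we never know whether the null hypothesis is true, so a function like this is useful for reasoning about errors, not for detecting them.

```python
def classify_outcome(null_is_true, p_value, threshold=0.05):
    """Map one test result onto a cell of the error table.

    Hypothetical helper for illustration only: with real data the truth
    of the null hypothesis is unknown.
    """
    significant = p_value < threshold
    if null_is_true and significant:
        return 'type one'   # rejected a true null: false positive
    if not null_is_true and not significant:
        return 'type two'   # failed to reject a false null: false negative
    return 'correct'


# Made-up p-value: the null is true, but the test came out significant.
print(classify_outcome(null_is_true=True, p_value=0.02))  # type one
```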

Consider the quiz question hypothesis test described in the previous exercises:

- Null: The probability that a learner answers a question correctly is 70%.
- Alternative: The probability that a learner answers a question correctly is not 70%.

Suppose, for a moment, that the true probability of a learner answering the question correctly **is** 70% (if we showed the question to ALL learners, exactly 70% would answer it correctly). This puts us in the first column of the table above (the null hypothesis “is true”). If we run a test and calculate a significant p-value, we will make a type I error (also called a false positive because the p-value is falsely significant), leading us to remove the question when we don’t need to.

On the other hand, if the true probability of getting the question correct **is not** 70%, the null hypothesis “is false” (the right-most column of our table). If we run a test and calculate a non-significant p-value, we make a type II error, leading us to leave the question on our site when we should have taken it down.
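To get a feel for how often a type I error happens, we can simulate the first scenario, where the null hypothesis really is true. The sketch below is a rough illustration under stated assumptions: the sample size of 100 learners, the 0.05 threshold, the number of simulated samples, and all function names are hypothetical choices, and the p-value is an exact two-sided binomial p-value computed with the standard library only. Every significant result in this simulation is a false positive, so the estimated rate should land near or somewhat below the threshold.

```python
import math
import random


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)


def two_sided_p_value(k, n, p_null):
    """Exact two-sided binomial p-value: total probability, under the null,
    of every outcome that is no more likely than the observed count k."""
    pmfs = [binom_pmf(i, n, p_null) for i in range(n + 1)]
    cutoff = pmfs[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return sum(pmf for pmf in pmfs if pmf <= cutoff)


def type_one_error_rate(n=100, p_null=0.7, threshold=0.05, trials=2000, seed=0):
    """Fraction of simulated samples that are significant when the null is true."""
    rng = random.Random(seed)
    # Pre-compute the p-value for every possible number of correct answers.
    p_values = [two_sided_p_value(k, n, p_null) for k in range(n + 1)]
    false_positives = 0
    for _ in range(trials):
        # Each simulated learner answers correctly with the null probability.
        correct = sum(rng.random() < p_null for _ in range(n))
        if p_values[correct] < threshold:
            false_positives += 1
    return false_positives / trials


rate = type_one_error_rate()
print(f"Estimated type I error rate: {rate:.3f}")
```

Changing the simulated probability away from `p_null` while still testing against 70% would instead estimate how often a type II error occurs.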

### Instructions

**1.**

Suppose that the average score on a standardized test is 50 points. A researcher wants to know whether students who take this test in an ergonomically designed chair score significantly differently from the general population of test-takers. The researcher randomly assigns 100 students to take the test in an ergonomic chair. Then, the researcher runs a hypothesis test with a significance threshold of 0.05 and the following null and alternative hypotheses:

- Null: The mean score for students who take the test in an ergonomic chair **is** 50 points.
- Alternative: The mean score for students who take the test in an ergonomic chair **is not** 50 points.

Suppose that the truth (which the researcher doesn’t know) is: if every student took the test in an ergonomic chair, the average score for all test-takers would be 52 points.

Based on their sample of only 100 students, the researcher calculates a p-value of 0.07. In **script.py**, change the value of `outcome` to:

- `'correct'` if the researcher will come to the *correct* conclusion based on this test
- `'type one'` if the researcher will make a *type I error* based on this test
- `'type two'` if the researcher will make a *type II error* based on this test