Key Concepts

Review core concepts you need to learn to master this subject

Hypothesis Test Errors

A Type I error, also known as a false positive, is the error of rejecting a null hypothesis when it is actually true. This can be viewed as a miss being registered as a hit. The acceptable rate of this type of error is called the significance level and is usually set to 0.05 (5%) or 0.01 (1%).

A Type II error, also known as a false negative, is the error of not rejecting a null hypothesis when the alternative hypothesis is true. This can be viewed as a hit being registered as a miss.

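Depending on the purpose of the test, testers decide which type of error they are more concerned about. Conventionally, the Type I error is treated as the more serious of the two, which is why its rate is fixed in advance through the significance level.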
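Both error rates can be estimated by simulation. The following is a minimal sketch in base R, not a definitive procedure: the sample size of 30, the 2,000 repeated experiments, the true mean of 0.3 used for the Type II case, and the seed are all illustrative assumptions.

```r
# Estimate Type I and Type II error rates of a one-sample t-test by simulation.
set.seed(42)               # arbitrary seed, for reproducibility
alpha <- 0.05              # significance level
n_experiments <- 2000      # hypothetical number of repeated experiments

# Case 1: the null hypothesis (mean = 0) is TRUE.
p_values_null <- replicate(n_experiments, {
  sample_data <- rnorm(30, mean = 0, sd = 1)
  t.test(sample_data, mu = 0)$p.value
})
# Share of experiments that wrongly rejected a true null
type_1_rate <- mean(p_values_null <= alpha)

# Case 2: the null hypothesis (mean = 0) is FALSE; the true mean is 0.3.
p_values_alt <- replicate(n_experiments, {
  sample_data <- rnorm(30, mean = 0.3, sd = 1)
  t.test(sample_data, mu = 0)$p.value
})
# Share of experiments that failed to reject a false null
type_2_rate <- mean(p_values_alt > alpha)

print(type_1_rate)
print(type_2_rate)
```

The estimated Type I rate should land close to the chosen significance level. The Type II rate depends on the sample size and on how far the true mean is from the hypothesized one.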

Sample Vs. Population Mean

In statistics, we often use the mean of a sample to estimate or infer the mean of the broader population from which the sample was taken. In other words, the sample mean is an estimate of the population mean.
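As a small illustration, the sketch below builds a hypothetical population in R and compares its mean with the means of two random samples. The population (10,000 simulated tree heights in feet, normally distributed) and the sample sizes are assumptions made for the example, not data from the lesson.

```r
# Compare sample means with the mean of the population they were drawn from.
set.seed(1)                                      # arbitrary seed

# Hypothetical population: heights (in feet) of 10,000 trees
population <- rnorm(10000, mean = 30, sd = 5)
population_mean <- mean(population)

# Two random samples of different sizes, drawn without replacement
sample_small <- sample(population, 10)
sample_large <- sample(population, 500)

print(population_mean)
print(mean(sample_small))
print(mean(sample_large))
```

Each sample mean is an estimate of the population mean. Larger samples typically give estimates that fall closer to it, although any single small sample can happen to land close as well.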

Central Limit Theorem

The central limit theorem states that as the sample size grows, the distribution of sample means approaches a normal distribution centered on the population mean. This holds regardless of the shape of the population distribution (uniform, binomial, etc.), provided the population has a finite variance.

The central limit theorem allows us to perform tests, make inferences, and solve problems using the normal distribution, even when the population is not normally distributed.
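The theorem can be checked empirically. The sketch below, in base R, draws repeated samples from a deliberately skewed population and summarizes the resulting sample means. The exponential population, the sample size of 50, and the 1,000 repetitions are illustrative assumptions.

```r
# Sampling distribution of the mean for a non-normal population.
set.seed(7)                                   # arbitrary seed

# Right-skewed population: exponential with mean 1 / rate = 2
population <- rexp(100000, rate = 0.5)

# Draw 1,000 samples of size 50 and record each sample's mean
sample_means <- replicate(1000, mean(sample(population, 50)))

# The center of the sampling distribution should be near the population mean
print(mean(population))
print(mean(sample_means))

# Its spread should be near sd(population) / sqrt(sample size)
print(sd(sample_means))
print(sd(population) / sqrt(50))

# Uncomment to view the roughly bell-shaped histogram of sample means:
# hist(sample_means)
```

Even though the population itself is far from normal, a histogram of `sample_means` should look approximately bell-shaped and be centered near the population mean.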

Hypothesis Test P-value

Statistical hypothesis tests return a p-value, which is the probability of observing results at least as extreme as those in the sample, assuming the null hypothesis is true. It is not the probability that the null hypothesis itself is true. If the p-value is less than or equal to the significance level, the null hypothesis is rejected in favor of the alternative hypothesis. If the p-value is greater than the significance level, the null hypothesis is not rejected.
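The decision rule can be sketched in R with the built-in `t.test()` function. The data here are simulated and purely hypothetical (minutes spent on a website by 40 visitors), and the hypothesized mean of 25 minutes and significance level of 0.05 are assumptions chosen for the example.

```r
# Compare a p-value with a significance level to reach a decision.
set.seed(123)                                 # arbitrary seed
alpha <- 0.05                                 # significance level

# Hypothetical data: minutes spent on a site by 40 visitors
minutes <- rnorm(40, mean = 29, sd = 6)

# Null hypothesis: the population mean is 25 minutes
result <- t.test(minutes, mu = 25)
p_value <- result$p.value

# Reject the null only when the p-value is at or below the significance level
if (p_value <= alpha) {
  decision <- "reject the null hypothesis"
} else {
  decision <- "fail to reject the null hypothesis"
}

print(p_value)
print(decision)
```

The significance level is fixed before looking at the data. The p-value then only determines which side of that threshold the observed result falls on.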

Hypothesis Testing with R
Lesson 1 of 1
  1. Say you work for a major social media website. Your boss comes to you with two questions: does the demographic of users on your site match the company’s expectation? did the new interface upda…
  2. Suppose you want to know the average height of an oak tree in your local park. On Monday, you measure 10 trees and get an average height of 32 ft. On Tuesday, you measure 12 different trees and rea…
  3. In the previous exercise, the sample means you calculated closely approximated the population mean. This won’t always be the case! Consider a tailor of school uniforms at a school for students age…
  4. You begin the statistical hypothesis testing process by defining a hypothesis, or an assumption about your population that you want to test. A hypothesis can be written in words, but can also be …
  5. Suppose you want to know if students who study history are more interested in volleyball than students who study chemistry. Before doing anything else to answer your original question, you come up …
  6. When using automated processes to make decisions, you need to be aware of how this automation can lead to mistakes. Computer programs can be as fallible as the humans who design them. Because of th…
  7. You know that a hypothesis test is used to determine the validity of a null hypothesis. Once again, the null hypothesis states that there is no actual difference between the two populations of data…
  8. While a hypothesis test will return a p-value indicating a level of confidence in the null hypothesis, it does not definitively claim whether you should reject the null hypothesis. To make this dec…
  9. Consider the fictional business BuyPie, which sends ingredients for pies to your household so that you can make them from scratch. Suppose that a product manager hypothesizes the average age of vis…
  10. Suppose that last week, the average amount of time spent per visitor to a website was 25 minutes. This week, the average amount of time spent per visitor to a website was 29 minutes. Did the averag…
  11. Suppose that you own a chain of stores that sell ants, called VeryAnts. There are three different locations: A, B, and C. You want to know if the average ant sales over the past year are significan…
  12. In the last exercise, you saw that the probability of making a Type I error got dangerously high as you performed more t-tests. When comparing more than two numerical datasets, the best way to pre…
  13. Before you use numerical hypothesis tests, you need to be sure that the following things are true: 1. The samples should each be normally distributed…ish Data analysts in the real world of…
  14. Phew! Nobody said hypothesis testing is easy, but you made it to the end of the lesson. Congratulations! The world of hypothesis testing is vast. There is much more you can learn, and so many appli…
