Suppose that last week, the average amount of time spent per visitor to a website was 25 minutes. This week, the average amount of time spent per visitor to the website was 29 minutes. Did the average time spent per visitor change (i.e., was there a statistically significant bump in user time on the site)? Or is this just part of natural fluctuations?
One way of testing whether this difference is significant is by using a Two Sample T-Test. A Two Sample T-Test compares two sets of data, which are both approximately normally distributed.
The null hypothesis, in this case, is that the two distributions have the same mean.
You can use R’s t.test() function to perform a Two Sample T-Test, as shown below:
results <- t.test(distribution_1, distribution_2)
When performing a Two Sample T-Test, t.test() takes two distributions as arguments and returns, among other information, a p-value. Remember, the p-value lets you know how likely it is to see a difference in means at least this large from chance alone (sampling error) if the null hypothesis were true.
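As a minimal sketch of how the p-value can be read from the result, the snippet below simulates two samples with rnorm() and pulls the p.value element out of the object that t.test() returns. The means, standard deviations, sample sizes, and seed are made-up assumptions for illustration, not data from this lesson.

```r
# Hypothetical data: both samples are simulated, not taken from the lesson
set.seed(42)
distribution_1 <- rnorm(100, mean = 25, sd = 5)
distribution_2 <- rnorm(100, mean = 29, sd = 5)

# By default, t.test() runs Welch's test, which does not assume equal variances
results <- t.test(distribution_1, distribution_2)

# The result is a list of class "htest"; its p.value element holds the p-value
results$p.value
```

A small p-value means a difference this large would be unusual if both samples really shared the same mean.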
We’ve created two distributions representing the time spent per visitor to BuyPie.com last week, week_1, and the time spent per visitor to BuyPie.com this week, week_2.
Find the means of these two distributions. Store them in week_1_mean and week_2_mean. View both means.
Find the standard deviations of these two distributions. Store them in week_1_sd and week_2_sd. View both standard deviations.
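The two summary steps above could be sketched as follows. The week_1 and week_2 vectors here are simulated stand-ins, since the lesson's actual data is not shown, and the names week_1_mean and week_1_sd are assumed by analogy with week_2_mean and week_2_sd.

```r
# Hypothetical stand-ins for the lesson's week_1 and week_2 vectors
set.seed(1)
week_1 <- rnorm(50, mean = 25, sd = 4)
week_2 <- rnorm(50, mean = 29, sd = 4)

# Mean time spent per visitor in each week
week_1_mean <- mean(week_1)
week_2_mean <- mean(week_2)
week_1_mean
week_2_mean

# sd() computes the sample standard deviation (n - 1 denominator)
week_1_sd <- sd(week_1)
week_2_sd <- sd(week_2)
week_1_sd
week_2_sd
```

Comparing the gap between the means with the size of the standard deviations gives a first sense of whether the two weeks overlap heavily.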
Run a Two Sample T-Test using the t.test() function.
Save the results to a variable called results and view it. Does the p-value make sense, knowing what you know about these datasets?
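One way to check whether the p-value makes sense is to rebuild the test statistic from the means and variances directly. The sketch below assumes t.test() is called with its defaults (Welch's unequal-variance test) and again uses simulated stand-ins for week_1 and week_2. The names standard_error and t_manual are illustrative.

```r
# Hypothetical stand-ins for the lesson's week_1 and week_2 vectors
set.seed(7)
week_1 <- rnorm(50, mean = 25, sd = 4)
week_2 <- rnorm(50, mean = 29, sd = 4)

results <- t.test(week_1, week_2)
results

# Rebuild the Welch t-statistic from the summary statistics:
# difference in means divided by its estimated standard error
standard_error <- sqrt(var(week_1) / length(week_1) + var(week_2) / length(week_2))
t_manual <- (mean(week_1) - mean(week_2)) / standard_error

# Compare the hand-computed value with the statistic reported by t.test()
all.equal(unname(results$statistic), t_manual)
```

When the difference in means is large relative to the standard error, the t-statistic is far from zero and the p-value is small.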