Suppose that last week, the average amount of time spent per visitor to a website was 25 minutes. This week, the average was 29 minutes. Did the average time spent per visitor really change (i.e., was there a statistically significant bump in user time on the site)? Or is this just part of natural fluctuations?
One way of testing whether this difference is significant is by using a Two Sample T-Test. A Two Sample T-Test compares the means of two sets of data, each of which should be approximately normally distributed.
The null hypothesis, in this case, is that the two distributions have the same mean.
You can use R’s t.test() function to perform a Two Sample T-Test, as shown below:
results <- t.test(distribution_1, distribution_2)
When performing a Two Sample T-Test, t.test() takes two distributions as arguments and returns, among other information, a p-value. Remember, the p-value lets you know the probability of seeing a difference in the means at least this large by chance alone (sampling error) if the null hypothesis is true.
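To make the call above concrete, here is a minimal sketch that runs t.test() on simulated data and pulls individual pieces out of the result. The sample sizes, means, and standard deviations are assumptions chosen only for illustration; they are not data from this lesson.

```r
set.seed(42)  # fixed seed so the simulated samples are reproducible

# Hypothetical samples: minutes on site for 100 visitors in each group
distribution_1 <- rnorm(100, mean = 25, sd = 5)
distribution_2 <- rnorm(100, mean = 29, sd = 5)

# Two Sample T-Test; the null hypothesis is that both means are equal
results <- t.test(distribution_1, distribution_2)

results$p.value    # the p-value of the test
results$estimate   # the two sample means
results$statistic  # the t statistic
```

Note that by default t.test() does not assume the two groups have equal variances (it runs Welch's version of the test). Passing var.equal = TRUE switches to the pooled-variance version.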
Instructions
We’ve created two distributions representing the time spent per visitor to BuyPie.com last week, week_1, and the time spent per visitor to BuyPie.com this week, week_2.
Find the means of these two distributions. Store them in week_1_mean and week_2_mean. View both means.
Find the standard deviations of these two distributions. Store them in week_1_sd and week_2_sd. View both standard deviations.
Run a Two Sample T-Test using the t.test() function.
Save the results to a variable called results and view it. Does the p-value make sense, knowing what you know about these datasets?
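One possible way to work through the steps above is sketched here. In the exercise, week_1 and week_2 are already provided, so the simulated values below are stand-ins (an assumption made only so the sketch runs on its own) and will not match the real data.

```r
set.seed(1)  # fixed seed so the stand-in data is reproducible

# Stand-in data (assumption): the exercise supplies the real week_1 and week_2
week_1 <- rnorm(50, mean = 25, sd = 7)
week_2 <- rnorm(50, mean = 29, sd = 7)

# Means of the two distributions
week_1_mean <- mean(week_1)
week_2_mean <- mean(week_2)
week_1_mean
week_2_mean

# Standard deviations of the two distributions
week_1_sd <- sd(week_1)
week_2_sd <- sd(week_2)
week_1_sd
week_2_sd

# Two Sample T-Test comparing the two weeks
results <- t.test(week_1, week_2)
results
```

When reading the output, it helps to compare the gap between the two means with the size of the standard deviations: a small gap relative to a large spread tends to produce a larger p-value.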