
Correlation and covariance have limitations as ways of assessing whether two variables are associated. Both measure the strength of linear relationships with non-zero slopes, but not other kinds of relationships, so a near-zero correlation does not necessarily mean that two variables are unrelated.
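To make this concrete, here is a minimal sketch using made-up data (the arrays `x`, `y_linear`, and `y_curved` are assumptions for illustration, not part of the lesson's dataset). It compares a perfectly linear relationship with a perfectly U-shaped one, using NumPy's `np.corrcoef`:

```python
import numpy as np

# Hypothetical data: x values spread symmetrically around zero
x = np.linspace(-3, 3, 101)

y_linear = 2 * x + 1   # perfect linear relationship with a non-zero slope
y_curved = x ** 2      # perfect, but non-linear (U-shaped), relationship

# np.corrcoef returns a 2x2 correlation matrix; [0, 1] is the x-y correlation
r_linear = np.corrcoef(x, y_linear)[0, 1]
r_curved = np.corrcoef(x, y_curved)[0, 1]

print(f"linear: {r_linear:.3f}")
print(f"curved: {r_curved:.3f}")
```

In this sketch, `y_curved` is completely determined by `x`, yet its correlation with `x` is essentially zero, because the increasing and decreasing halves of the curve cancel out.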

For example, the four scatter plots below all show pairs of variables with near-zero correlations. The bottom left image shows an example of a perfect linear association where the slope is zero (the line is horizontal). Meanwhile, the other three plots show non-linear relationships: if we drew a line through any of these sets of points, that line would need to be curved, not straight!

### Instructions

1.

A simulated dataset named `sleep` has been loaded for you in script.py. The hypothetical data contains two columns:

• `hours_sleep`: the number of hours that a person slept
• `performance`: that person’s performance score on a physical task the next day

Create a scatter plot of `hours_sleep` (on the x-axis) and `performance` (on the y-axis). What is the relationship between these variables?
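One possible way to approach this step is sketched below. The real `sleep` data is not reproduced here, so the code builds a hypothetical stand-in with the same two column names; its inverted-U shape and the random seed are assumptions chosen for illustration and may not match the actual dataset. Matplotlib is also an assumption, since the lesson does not name a plotting library.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in for the lesson's `sleep` data (NOT the real dataset)
rng = np.random.default_rng(0)
hours = np.linspace(2, 12, 60)
sleep = {
    "hours_sleep": hours,
    # Assumed shape: performance peaks at a moderate amount of sleep
    "performance": 90 - 3 * (hours - 7) ** 2 + rng.normal(0, 3, hours.size),
}

# Scatter plot with hours_sleep on the x-axis and performance on the y-axis
fig, ax = plt.subplots()
ax.scatter(sleep["hours_sleep"], sleep["performance"])
ax.set_xlabel("hours_sleep")
ax.set_ylabel("performance")
```

Calling `plt.show()` afterwards displays the figure in an interactive session. The `sleep["hours_sleep"]` indexing is used because it works for both a plain dictionary and a pandas DataFrame.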

2.

Calculate the correlation for `hours_sleep` and `performance` and save the result as `corr_sleep_performance`. Then, print out `corr_sleep_performance`. Does the correlation accurately reflect the strength of the relationship between these variables?
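A minimal sketch of this calculation follows. It again uses a hypothetical stand-in for `sleep` (the data values are assumptions, not the lesson's dataset) and computes the Pearson correlation with `np.corrcoef`; `scipy.stats.pearsonr` is another common choice, and the lesson does not say which function it expects.

```python
import numpy as np

# Hypothetical stand-in for the lesson's `sleep` data (NOT the real dataset)
rng = np.random.default_rng(0)
hours = np.linspace(2, 12, 60)
sleep = {
    "hours_sleep": hours,
    # Assumed shape: performance peaks at a moderate amount of sleep
    "performance": 90 - 3 * (hours - 7) ** 2 + rng.normal(0, 3, hours.size),
}

# Off-diagonal entry of the 2x2 correlation matrix is the Pearson correlation
corr_sleep_performance = np.corrcoef(
    sleep["hours_sleep"], sleep["performance"]
)[0, 1]
print(corr_sleep_performance)
```

With stand-in data of this shape, the printed correlation is close to zero even though performance depends strongly on hours of sleep, which is the kind of mismatch this exercise asks you to look for.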