One of the best ways to quickly visualize the relationship between two quantitative variables is to plot them against each other in a scatter plot. This makes it easy to look for patterns or trends in the data. Let’s start by plotting the area of a rental against its monthly price to see if we can spot any patterns.
import matplotlib.pyplot as plt

# housing is the dataframe of rental listings, already loaded
plt.scatter(x = housing.price, y = housing.sqfeet)
plt.xlabel('Rental Price (USD)')
plt.ylabel('Area (Square Feet)')
plt.show()
While there’s a lot of variation in the data, it seems like more expensive housing tends to come with slightly more space. This suggests an association between these two variables.
It’s important to note that different kinds of associations can lead to different patterns in a scatter plot. For example, the following plot shows the relationship between the age of a child in months and their weight in pounds. We can see that older children tend to weigh more but that the growth rate starts leveling off after 36 months:
If we don’t see any patterns in a scatter plot, we can probably guess that the variables are not associated. For example, a scatter plot like this would suggest no association:
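To get a feel for what no association looks like, here is a minimal sketch that plots two made-up variables generated independently of each other. The data is synthetic, and the names make_unrelated_data, x, and y are placeholders chosen for illustration, not part of the housing dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

def make_unrelated_data(n=200, seed=0):
    """Return two arrays of synthetic values drawn independently of each other."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0, 100, size=n)
    y = rng.uniform(0, 100, size=n)  # drawn separately, so y does not depend on x
    return x, y

# Synthetic stand-in data, not real measurements
x, y = make_unrelated_data()

plt.scatter(x = x, y = y)
plt.xlabel('Variable A')
plt.ylabel('Variable B')
plt.show()
```

The points should fill the plot as a shapeless cloud: knowing a point's x value tells us nothing about where its y value is likely to fall.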
The housing data has been saved for you as a dataframe named housing in script.py. Create a scatter plot to see if there is an association between the area (sqfeet) of a rental and the number of bedrooms (beds). Do you think these variables are associated? If so, is the relationship what you expected?
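One possible way to approach this exercise is sketched below. It assumes housing has columns named beds and sqfeet, as described above. The helper name plot_area_vs_beds and the axis labels are choices made for this example, not something the lesson requires.

```python
import matplotlib.pyplot as plt

def plot_area_vs_beds(housing):
    """Draw a scatter plot of area against number of bedrooms.

    Assumes housing has beds and sqfeet columns (hypothetical helper).
    """
    fig, ax = plt.subplots()
    ax.scatter(x = housing.beds, y = housing.sqfeet)
    ax.set_xlabel('Number of Bedrooms')
    ax.set_ylabel('Area (Square Feet)')
    return ax
```

With housing already loaded in script.py, calling plot_area_vs_beds(housing) followed by plt.show() displays the plot. Because bedroom counts are presumably whole numbers, the points will likely stack in vertical columns, so look at how the range of areas shifts from one column to the next.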