Learn

The spread of a quantitative variable describes the amount of variability. This is important because it provides context for measures of central tendency. For example, if there is a lot of variability in New York City rent prices, we can be less certain that the mean or median price is representative of what the typical rent is.

There are several common measures of spread:

  • Range: The difference between the maximum and minimum values of a variable.
  • Interquartile range (IQR): The difference between the 75th and 25th percentile values.
  • Variance: The average of the squared distance from each data point to the mean.
  • Standard deviation (SD): The square root of the variance.
  • Mean absolute deviation (MAD): The mean absolute value of the distance between each data point and the mean.

For our rentals DataFrame, we can calculate the spread for the rent column as follows:

# Range rentals.rent.max() - rentals.rent.min() # Interquartile range rentals.rent.quantile(0.75) - rentals.rent.quantile(0.25) from scipy.stats import iqr iqr(rentals.rent) # alternative way # Variance rentals.rent.var() # Standard deviation rentals.rent.std() # Mean absolute deviation rentals.rent.mad()

Instructions

1.

Using the movies DataFrame, find the range for production_budget and save it to a variable called range_budget. Print range_budget to see the result.

2.

Save the interquartile range for budget to a variable called iqr_budget and print the result.

3.

Save the variance to a variable called var_budget and print the result.

4.

Save the standard deviation to a variable called std_budget and print the result.

5.

Save the mean absolute deviation to a variable called mad_budget and print the result.

Sign up to start coding

Mini Info Outline Icon
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.

Or sign up using:

Already have an account?