Learn

For quantitative variables, we often want to describe the central tendency, or the “typical” value of a variable. For example, what is the typical cost of rent in New York City?

There are several common measures of central tendency:

  • Mean: The average value of the variable, calculated as the sum of all values divided by the number of values.
  • Median: The middle value of the variable when sorted.
  • Mode: The most frequent value of the variable.
  • Trimmed mean: The mean excluding x percent of the lowest and highest data points.

For our rentals DataFrame with a column named rent that contains rental prices, we can calculate the central tendency statistics listed above as follows:

# Mean rentals.rent.mean() # Median rentals.rent.median() # Mode rentals.rent.mode() # Trimmed mean from scipy.stats import trim_mean trim_mean(rentals.rent, proportiontocut=0.1) # trim extreme 10%

Instructions

1.

Using the same movies DataFrame from the last exercise, find the mean production_budget for all movies and save it to a variable called mean_budget. Print mean_budget to see the result.

2.

Save the median budget to a variable called med_budget and print the result.

3.

Save the mode to a variable called mode_budget and print the result.

How do the mean, median, and mode of movie budgets compare to each other?

4.

Find the mean of the budget after removing 20% of the lowest and highest data points. Save the trimmed mean to a variable called trmean_budget and print the result.

How does trimming the most extreme data points affect the mean budget?

Sign up to start coding

By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.
Already have an account?