Learn

For quantitative variables, we often want to describe the central tendency, or the “typical” value of a variable. For example, what is the typical cost of rent in New York City?

There are several common measures of central tendency:

• Mean: The average value of the variable, calculated as the sum of all values divided by the number of values.
• Median: The middle value of the variable when sorted.
• Mode: The most frequent value of the variable.
• Trimmed mean: The mean excluding x percent of the lowest and highest data points.

For our `rentals` DataFrame with a column named `rent` that contains rental prices, we can calculate the central tendency statistics listed above as follows:

``````# Mean
rentals.rent.mean()

# Median
rentals.rent.median()

# Mode
rentals.rent.mode()

# Trimmed mean
from scipy.stats import trim_mean
trim_mean(rentals.rent, proportiontocut=0.1)  # trim extreme 10%``````

### Instructions

1.

Using the same `movies` DataFrame from the last exercise, find the mean `production_budget` for all movies and save it to a variable called `mean_budget`. Print `mean_budget` to see the result.

2.

Save the median budget to a variable called `med_budget` and print the result.

3.

Save the mode to a variable called `mode_budget` and print the result.

How do the mean, median, and mode of movie budgets compare to each other?

4.

Find the mean of the budget after removing 20% of the lowest and highest data points. Save the trimmed mean to a variable called `trmean_budget` and print the result.

How does trimming the most extreme data points affect the mean budget?