Let’s imagine we are working for the city government of the fictional city of Melody Metropolis. The mayor of Melody Metropolis wants to know more about the musicians who currently live in the city. The learning environment shows a dataset we have on musicians living in the city as of last year. How would you describe this dataset? See if you can answer any of the following questions:

  • What does a typical musician’s income look like?
  • Is there a wide range of musician ages?
  • What proportion of the musicians in the dataset play guitar?

We can try to make generalizations by looking over the rows and columns, but it’s difficult to answer these questions precisely. We need some kind of “data vocabulary” that can help us measure and describe the variables in the dataset. Summary statistics can be used for exactly this purpose!
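To get a feel for how summary statistics turn the three questions above into single numbers, here is a minimal sketch using Python's standard `statistics` module. The five rows in `musicians` and the `summarize` helper are made up for illustration; they are not taken from the Melody Metropolis dataset. The choice of the median for "typical" income and of max minus min for the age range is just one reasonable option.

```python
from statistics import median

# Hypothetical rows for illustration only (not the real dataset).
musicians = [
    {"age": 24, "income": 32000, "instrument": "guitar"},
    {"age": 37, "income": 45000, "instrument": "piano"},
    {"age": 52, "income": 51000, "instrument": "guitar"},
    {"age": 29, "income": 28000, "instrument": "drums"},
    {"age": 41, "income": 60000, "instrument": "violin"},
]

def summarize(rows):
    """Return one summary number for each of the three questions."""
    incomes = [r["income"] for r in rows]
    ages = [r["age"] for r in rows]
    guitar_count = sum(1 for r in rows if r["instrument"] == "guitar")
    return {
        # "Typical" income, measured here as the middle value.
        "median_income": median(incomes),
        # Spread of ages, measured here as oldest minus youngest.
        "age_range": max(ages) - min(ages),
        # Share of musicians whose primary instrument is guitar.
        "guitar_proportion": guitar_count / len(rows),
    }

summary = summarize(musicians)
print(summary)
```

Each entry in `summary` answers one question with a single, precise value, which is much easier to communicate than a scan of the raw rows.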

With a basic understanding of summary statistics, we can understand and communicate much more specific information about the musicians in the city. But learning statistics is often associated with negative experiences:

  • Memorization of lots of math formulas
  • Long calculations done by hand
  • Confusing or meaningless interpretations

None of these struggles needs to be part of learning to use statistics. In this lesson, we’ll gain a conceptual understanding of how summary statistics help us interpret our dataset and communicate what it contains.


Before moving to the next exercise, familiarize yourself with the following names and descriptions of the variables in the dataset:

  • age: age in years
  • income: yearly income in US dollars
  • title: primary job title
  • experience: years of experience in the field of music
  • instrument: primary instrument
  • band: whether in a band (1 = yes, 0 = no)
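As a rough sketch of how these variables could be represented in code, the hypothetical `Musician` record below mirrors the six descriptions above. The field types and the four sample records are assumptions made for illustration, not values from the actual dataset. One handy consequence of coding `band` as 1/0 is that its mean equals the proportion of musicians who are in a band.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Musician:
    age: int          # age in years
    income: float     # yearly income in US dollars
    title: str        # primary job title
    experience: int   # years of experience in the field of music
    instrument: str   # primary instrument
    band: int         # whether in a band (1 = yes, 0 = no)

# Hypothetical records for illustration only.
sample = [
    Musician(24, 32000, "session musician", 5, "guitar", 1),
    Musician(37, 45000, "music teacher", 15, "piano", 0),
    Musician(52, 51000, "composer", 30, "guitar", 1),
    Musician(29, 28000, "performer", 8, "drums", 1),
]

# Averaging a 1/0 variable gives the share of rows equal to 1.
band_share = mean([m.band for m in sample])
print(band_share)
```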

What are you interested in learning about the musicians of Melody Metropolis?
