Tidy Data

Tidy Data Rules

A tidy dataset follows three fundamental rules:

  1. Each variable forms a column.
  2. Each observation forms a row.
  3. Each type of observational unit forms a table.

Below is an example of a tidy dataset:

ID#  Student  Year  Class    Grade
1    Brown    2020  Chem     F
1    Brown    2021  Chem     B
1    Brown    2021  Math     A
2    Smith    2020  Bio      C
2    Smith    2021  CompSci  B
3    Saito    2020  Chem     A
3    Saito    2021  Math     B
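
As a rough illustration of why this layout is convenient, the sketch below loads the same table with Python's standard `csv` module. The CSV text, the column name `ID` (in place of `ID#`), and the variable names are assumptions made for this example, not part of the original cheatsheet.

```python
import csv
import io

# The tidy table from above, rewritten as CSV text for this example.
TIDY_CSV = """ID,Student,Year,Class,Grade
1,Brown,2020,Chem,F
1,Brown,2021,Chem,B
1,Brown,2021,Math,A
2,Smith,2020,Bio,C
2,Smith,2021,CompSci,B
3,Saito,2020,Chem,A
3,Saito,2021,Math,B
"""

# Each row becomes a dict keyed by column name (all values are strings).
rows = list(csv.DictReader(io.StringIO(TIDY_CSV)))

# Rule 1: each variable is a column, so filtering is a comparison on one key.
chem_2021 = [r for r in rows if r["Class"] == "Chem" and r["Year"] == "2021"]

# Rule 2: each observation is a row, so (ID, Year, Class) should not repeat.
keys = [(r["ID"], r["Year"], r["Class"]) for r in rows]
is_unique = len(keys) == len(set(keys))

print(chem_2021)
print(is_unique)
```

Because every variable lives in exactly one column, questions such as "who took Chem in 2021?" need no knowledge of how column names are spelled or combined.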

Messy Data

Messy data is data that violates at least one of the tidy dataset rules (1. Each variable forms a column; 2. Each observation forms a row; 3. Each type of observational unit forms a table).

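Below is an example of messy data. The column headers encode values (class and year) instead of naming variables, the header spellings are inconsistent, and the grades mix letters and numbers: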

ID# Name ChemGrade2020 MathGrade2020 BioGrade2020 CHemGrade2021 MathGrad2021 BioGrade21
1 Brown F B B C
B smith 100 95 65
3 Saito, K A 90 B 85
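
One common way to tidy a wide table like this is to turn each class/year column into its own row. The sketch below is a minimal illustration under stated assumptions: `wide_rows` is hypothetical, already-cleaned data (consistent header spellings, letter grades only), not the messy table above, and `to_tidy` and `COLUMN_PATTERN` are names invented for this example.

```python
import re

# Hypothetical wide-format records: one column per class/year combination.
wide_rows = [
    {"ID": 1, "Name": "Brown", "ChemGrade2020": "F",
     "ChemGrade2021": "B", "MathGrade2021": "A"},
    {"ID": 3, "Name": "Saito", "ChemGrade2020": "A", "MathGrade2021": "B"},
]

# Matches headers such as "ChemGrade2020" and captures the class and the year.
COLUMN_PATTERN = re.compile(r"^(?P<cls>[A-Za-z]+)Grade(?P<year>\d{4})$")


def to_tidy(rows):
    """Return one dict per (student, year, class) observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            match = COLUMN_PATTERN.match(column)
            # Skip identifier columns (ID, Name) and empty cells.
            if match is None or value in (None, ""):
                continue
            tidy.append({
                "ID": row["ID"],
                "Student": row["Name"],
                "Year": int(match.group("year")),
                "Class": match.group("cls"),
                "Grade": value,
            })
    return tidy


tidy_rows = to_tidy(wide_rows)
for record in tidy_rows:
    print(record)
```

Real messy data would first need its headers and values normalized (for example, fixing misspelled column names and converting numeric scores to one grading scale) before a reshaping step like this could apply.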