When we first look at a dataset, we want to be able to quickly understand certain things about it:
- Do some values occur more often than others?
- What is the range of the dataset (i.e., the min and the max values)?
- Are there a lot of outliers?
We can visualize this information using a chart called a histogram.
For instance, suppose that we have the following dataset:
d = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 5]
A simple histogram might show us how many 1’s, 2’s, 3’s, etc. we have in this dataset.
| Value | Number of Samples |
|-------|-------------------|
| 1 | 3 |
| 2 | 5 |
| 3 | 2 |
| 4 | 4 |
| 5 | 1 |
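
As a minimal sketch, the per-value counts described above could be tallied in Python with `collections.Counter` from the standard library. The variable name `counts` is an assumption introduced for illustration; the dataset `d` is the one shown earlier.

```python
from collections import Counter

d = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 5]

# Tally how many times each distinct value appears in the dataset.
counts = Counter(d)

# Print each value in ascending order alongside its number of samples.
for value in sorted(counts):
    print(value, counts[value])

# The range of the dataset is bounded by its smallest and largest values.
print("min:", min(d), "max:", max(d))
```

The same tally also answers the first question in the list at the top: the value with the largest count is the one that occurs most often.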
When graphed, our histogram would look like this:
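
As a rough stand-in for a plotted chart, the sketch below prints one horizontal bar of `#` characters per value, so the bar length equals the number of samples. It assumes the dataset `d` from above; `text_histogram` is a hypothetical helper name introduced only for illustration, and a plotting library would normally draw vertical bars instead.

```python
from collections import Counter

d = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 5]
counts = Counter(d)


def text_histogram(counts):
    """Return one line per value, with a bar of '#' as long as its count."""
    lines = []
    for value in sorted(counts):
        lines.append(f"{value} | " + "#" * counts[value])
    return "\n".join(lines)


print(text_histogram(counts))
```

For this dataset, the longest bar belongs to the value 2 (five samples) and the shortest to the value 5 (one sample).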
Look at the histogram to the right. How many values in the dataset are equal to either 5 or 6?
Save the amount to the variable