Key Concepts

Review core concepts you need to learn to master this subject

Symmetric Distribution in Histogram

In a histogram, the distribution of the data is symmetric if it has one prominent peak and tails of roughly equal length on the left and the right. The median and the mean of a symmetric dataset are approximately equal.
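As a rough illustration, the sketch below builds a text histogram from a small made-up dataset and checks that the bar heights mirror each other around the peak. The sample values are an assumption chosen for the example, not data from the lesson.

```python
from collections import Counter
import statistics

# Hypothetical sample data, chosen to be symmetric around 3.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Count how often each value occurs (one histogram bar per value).
counts = Counter(data)
for value in sorted(counts):
    print(f"{value}: {'#' * counts[value]}")

# Bar heights read left to right; a symmetric shape reads the same reversed.
heights = [counts[value] for value in sorted(counts)]
is_symmetric = heights == heights[::-1]

mean = statistics.mean(data)
median = statistics.median(data)
print(f"symmetric={is_symmetric}, mean={mean}, median={median}")
```

For this dataset the mean and the median coincide. Real data is rarely this tidy, so expect the two values to be close rather than identical.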

Right-skewed Dataset

In a histogram, if the prominent peak lies to the left with the tail extending to the right, then it is called a right-skewed dataset. In this case, the median is less than the mean of the dataset.
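A small made-up example can show why the mean ends up above the median here: a few large values in the right tail pull the mean upward, while the median stays near the peak. The dataset below is an assumption for illustration only.

```python
import statistics

# Hypothetical right-skewed data: most values are small, one lies far to the right.
right_skewed = [1, 1, 2, 2, 3, 10]

mean = statistics.mean(right_skewed)
median = statistics.median(right_skewed)

# The long right tail drags the mean up; the median is barely affected.
print(f"mean={mean:.2f}, median={median}")
print("median < mean:", median < mean)
```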

Left-Skewed Dataset

A left-skewed dataset has a long left tail with one prominent peak to the right. The median of this dataset is greater than the mean of this dataset.
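The relationships between mean and median described so far can be turned into a rough rule of thumb. The sketch below is a heuristic only, not a formal skewness test; the function name `describe_skew` and the `tolerance` parameter are hypothetical choices made for this example.

```python
import statistics

def describe_skew(data, tolerance=0.1):
    """Label skew by comparing mean and median (a heuristic, not a formal test).

    `tolerance` is a hypothetical absolute threshold below which the two
    statistics are treated as equal.
    """
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean - median > tolerance:
        return "right-skewed"   # tail on the right pulls the mean up
    if median - mean > tolerance:
        return "left-skewed"    # tail on the left pulls the mean down
    return "roughly symmetric"

# Made-up datasets for illustration.
print(describe_skew([1, 8, 9, 9, 10, 10]))  # long left tail
print(describe_skew([1, 1, 2, 2, 3, 10]))   # long right tail
print(describe_skew([1, 2, 3, 4, 5]))       # balanced
```

A sensible tolerance depends on the scale of the data, so looking at the histogram itself remains the more reliable check.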

Unimodal Distribution

Modality describes the number of peaks in a dataset. A unimodal distribution has one distinct peak in its histogram, which marks the most frequent value.

Bimodal Dataset

A bimodal dataset has two distinct peaks. This typically happens when the dataset contains two different populations.

Multimodal Dataset

If a histogram has more than two peaks, then the dataset is referred to as multimodal.
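One simple way to sketch the idea of modality in code is to count the bars that are strictly taller than both of their neighbors. This is a simplified assumption: real histograms are noisy, and flat-topped peaks or small bumps would need smoothing or a minimum-height rule that this example does not include. The function names are hypothetical.

```python
def count_peaks(bin_counts):
    """Count bins that are strictly taller than both neighboring bins."""
    # Pad both ends so the first and last bins can also qualify as peaks.
    padded = [float("-inf")] + list(bin_counts) + [float("-inf")]
    peaks = 0
    for i in range(1, len(padded) - 1):
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]:
            peaks += 1
    return peaks

def describe_modality(bin_counts):
    """Map the number of peaks to the terms used in this section."""
    peaks = count_peaks(bin_counts)
    if peaks == 0:
        return "no distinct peak"
    if peaks == 1:
        return "unimodal"
    if peaks == 2:
        return "bimodal"
    return "multimodal"

# Made-up bar heights for illustration.
print(describe_modality([1, 3, 6, 3, 1]))        # one peak
print(describe_modality([1, 5, 2, 6, 1]))        # two peaks
print(describe_modality([1, 4, 1, 5, 1, 6, 1]))  # three peaks
```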

Uniform Dataset

A uniform dataset does not have any distinct peaks.

Uniform datasets have approximately the same number of values in each group represented by a bar, so there is no obvious clustering.
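To make "the same number of values in each group" concrete, the sketch below bins a made-up dataset and compares the bar heights. The data and the bin width of 5 are assumptions chosen so the bins come out even.

```python
from collections import Counter

# Hypothetical data: the integers 1 through 20, each appearing once.
data = list(range(1, 21))
bin_width = 5

# Bin index 0 covers 1-5, index 1 covers 6-10, and so on.
counts = Counter((x - 1) // bin_width for x in data)
bin_counts = [counts[i] for i in sorted(counts)]

# A small gap between the tallest and shortest bar suggests a uniform shape.
height_gap = max(bin_counts) - min(bin_counts)
print(f"bin counts={bin_counts}, tallest minus shortest={height_gap}")
```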

Peak of Unimodal Distribution

In a unimodal distribution, the center of the dataset lies at or near the peak. The statistics that describe the center of a dataset are the mean and the median.
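As a quick check of this idea, the sketch below computes the mean, median, and most frequent value for a small made-up unimodal dataset using Python's standard `statistics` module. The data is an assumption for illustration.

```python
import statistics

# Hypothetical unimodal data with its peak at 4.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6]

mean = statistics.mean(data)
median = statistics.median(data)
peak = statistics.mode(data)  # the most frequent value, i.e. the tallest bar

print(f"mean={mean}, median={median}, peak={peak}")
```

In this symmetric example all three values agree. In a skewed dataset the mean and median would drift away from the peak, toward the longer tail.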

Spread of a Dataset

The spread of a dataset is the dispersion from the dataset’s center. The descriptive statistics that describe the spread are range, variance and standard deviation.

For example, for the dataset [1, 4, 7, 10], the range is the maximum value minus the minimum value: 10 - 1 = 9.
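The same dataset can be used to sketch all three spread statistics. This example assumes the population versions of variance and standard deviation (`pvariance` and `pstdev`); the sample versions (`variance` and `stdev`) divide by n - 1 and would give larger values.

```python
import statistics

data = [1, 4, 7, 10]

# Range: the distance between the largest and smallest values.
data_range = max(data) - min(data)

# Population variance: the average squared distance from the mean.
variance = statistics.pvariance(data)

# Population standard deviation: the square root of the variance.
std_dev = statistics.pstdev(data)

print(f"range={data_range}, variance={variance}, std dev={std_dev:.2f}")
```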

Dataset Outliers

An outlier is a data point that differs significantly from the rest of the values in a dataset.

For example, in the dataset [1, 2, 3, 4, 100] the value 100 is an outlier because it lies a large distance from the rest of the data.
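One way to see why outliers matter is to compare the center of this dataset with and without the value 100. The sketch below removes the outlier by hand, since the text gives no formal rule for detecting one.

```python
import statistics

with_outlier = [1, 2, 3, 4, 100]
# The outlier is dropped manually here; this is not an automatic detection rule.
without_outlier = [1, 2, 3, 4]

mean_with = statistics.mean(with_outlier)
median_with = statistics.median(with_outlier)
mean_without = statistics.mean(without_outlier)
median_without = statistics.median(without_outlier)

# A single extreme value moves the mean a lot and the median only slightly.
print(f"with outlier:    mean={mean_with}, median={median_with}")
print(f"without outlier: mean={mean_without}, median={median_without}")
```

The mean changes far more than the median, which is one reason the median is often preferred as a measure of center when outliers are present.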

Describe a Histogram
Lesson 1 of 1
  1. At this point, you should be familiar with what a histogram displays. If you are not, take a few minutes to complete our lesson on histograms. In this lesson, we’re going to build on those skill…
  2. One of the most common ways to summarize a dataset is to communicate its center. In this lesson, we will use average and median as our measures of centrality. Take the Codecademy lessons on averag…
  3. Once you’ve found the center of your data, you can shift to identifying the extremes of your dataset: the minimum and maximum values. These values, taken with the mean and median, begin to indicate…
  4. Once you have the center and range of your data, you can begin to describe its shape. The skew of a dataset is a description of the data’s symmetry. A dataset with one prominent peak, and similar …
  5. The modality describes the number of peaks in a dataset. Thus far, we have only looked at datasets with one distinct peak, known as unimodal. This is the most common. …
  6. An outlier is a data point that is far away from the rest of the dataset. Outliers do not have a formal definition, but are easy to determine by looking at a histogram. The histogram below shows an ex…
  7. In this lesson, you learned a framework for describing the distribution of a dataset, which includes the following five features: center, spread, skew, modality, and outliers. If you’re curious,…
