Just like we can use a scatter plot to examine the relationship between two numeric variables, we can use distribution plots to examine a numeric variable’s distribution of values.

#### Histograms

The most basic way to plot our data is to create a histogram. A histogram looks like a bar chart, but instead of having a bar for each category of a variable, it has a bar for sets of numeric values called **bins**. The height of the bar shows how many data points of the variable fall within that bin’s range of values.

We can create a histogram of `total_sales` from our restaurant dataset `df` using the seaborn function `sns.histplot()`.

    sns.histplot(data=df, x='total_sales')

This code will display a histogram with vertical bars. Using `y` instead of `x` will create a histogram with horizontal bars.

Seaborn sets the `bins` parameter to `auto` by default, but we can change the binning of values in a number of ways:

- **Number of bins:** an integer for the number of bins to fit the data to
- **Bin breaks:** a list of values for where bins should start and end
- **Reference rule:** the name of a method to compute the optimal bin width, including `auto` (which uses whichever of the `sturges` and `fd` reference rules yields more bins)

Note that poorly chosen bin sizes can distort a histogram: too few bins can hide features such as multiple peaks, while too many can make random noise look like structure.
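The three ways of specifying `bins` can be compared without drawing anything. Seaborn's documentation states that `bins` is passed to `numpy.histogram_bin_edges()`, so the sketch below calls that function directly. The data is synthetic and the break points in the second case are arbitrary values chosen for illustration.

```python
import numpy as np

# Synthetic, right-skewed data standing in for a numeric column.
rng = np.random.default_rng(seed=0)
values = rng.gamma(shape=2.0, scale=15.0, size=200)

# 1. Number of bins: an integer produces that many equal-width bins.
edges_count = np.histogram_bin_edges(values, bins=5)

# 2. Bin breaks: an explicit list of edges (arbitrary illustrative values).
edges_breaks = np.histogram_bin_edges(values, bins=[0, 10, 25, 50, 100, 200])

# 3. Reference rule: the name of a method that picks the bin width.
edges_sturges = np.histogram_bin_edges(values, bins="sturges")
edges_fd = np.histogram_bin_edges(values, bins="fd")
edges_auto = np.histogram_bin_edges(values, bins="auto")

# n edges always delimit n - 1 bins.
for label, edges in [
    ("bins=5", edges_count),
    ("explicit breaks", edges_breaks),
    ("sturges", edges_sturges),
    ("fd", edges_fd),
    ("auto", edges_auto),
]:
    print(f"{label}: {len(edges) - 1} bins")
```

Any of these values can be passed to `sns.histplot()` through its `bins` parameter, for example `bins=5` or `bins='fd'`.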

#### KDE plots

Another option for displaying a distribution is a kernel density estimation (KDE) plot. A KDE plot displays a continuous probability density curve for the distribution. This estimation looks a lot like a smoothed version of a histogram.

We can create a KDE plot of `total_sales` using `sns.kdeplot()`. We can also set the optional parameter `fill` to `True` so that the plot will be shaded below the KDE curve.

    sns.kdeplot(data=df, x='total_sales', fill=True)

As with histograms, using `y` instead of `x` will orient the plot horizontally.

#### Box plots

Finally, let’s look at a plot that displays distributions for each category of a second variable. The box plot communicates specific information about each category’s distribution through a pattern of lines and a box, as shown in the following diagram:

*Note:* seaborn creates a horizontal box plot when the numeric variable is passed as `x`, and a vertical box plot like the previous diagram when it is passed as `y` instead.

If we want to see a distribution of `total_sales` for each `day` of the week, we can use `sns.boxplot()` as shown in the following code.

    sns.boxplot(data=df, x='total_sales', y='day')

Swapping the `x` and `y` parameters will change the orientation of the plot.
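The sketch below draws the grouped box plot on synthetic data and then computes the numbers each box encodes: the box spans the first to third quartile, with a line at the median. The `day` labels, the `total_sales` values, and the `quartiles` table are all illustrative assumptions, not part of the original dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the restaurant dataset; the values are synthetic.
rng = np.random.default_rng(seed=0)
days = ["Thur", "Fri", "Sat", "Sun"]
df = pd.DataFrame({
    "day": rng.choice(days, size=200),
    "total_sales": rng.gamma(shape=2.0, scale=15.0, size=200),
})

# Numeric variable on x, categories on y: one horizontal box per day.
ax = sns.boxplot(data=df, x="total_sales", y="day", order=days)

# The same summary in numbers: quartiles and interquartile range per day.
quartiles = (
    df.groupby("day")["total_sales"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
quartiles.columns = ["q1", "median", "q3"]
quartiles["iqr"] = quartiles["q3"] - quartiles["q1"]
print(quartiles.round(1))
```

With seaborn's default settings, the whiskers reach the most extreme points within 1.5 times the interquartile range of the box, and points beyond that are drawn individually as outliers.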

### Instructions

**1.**

Run all initial code cells. Then create a histogram of the municipal solid waste (`msw`) of countries in the `waste` dataset. Do not specify the `bins` parameter; seaborn will calculate the number of bins automatically.

**2.**

Let’s see how the shape of the previous histogram changes when we decrease the number of bins. Repeat the plot from question 1 but set the `bins` parameter to 5.

**3.**

Now let’s see what the same distribution looks like when we use a KDE plot. Plot `msw` in a KDE plot with shading below the curve.

**4.**

Let’s visualize more detailed information like the median, quartiles, and outliers. Create a box plot of `msw`.

**5.**

Finally, let’s add a little more complexity to our box plot by displaying the `msw` distributions of countries from different income levels. Repeat the plot from question 4 but add `income` as the `y` parameter.