As we saw in the last exercise, the structure of seaborn plotting functions is a little different from that of matplotlib. Seaborn works well with pandas DataFrames in long format. Most plotting functions in seaborn follow the same general structure:
import seaborn as sns
# plot is a placeholder here; substitute a specific function such as barplot
sns.plot(data=df, x='col1', y='col2')
The main parameters are:
- data is an optional parameter that takes the pandas DataFrame.
- x is the column name for the x-axis of the plot.
- y is the column name for the y-axis of the plot.
Depending on the type of plot, we may need only x, only y, or both. We may also set other parameters to add more detail to the plot.
Also note that if we are working with arrays, we can enter the arrays in either the data parameter or the x and y parameters. The arrays can be named or entered directly into the function.
Let’s review some of the available plot options we have with seaborn.
- bar chart | sns.barplot(): uses bar height to compare a measure between categorical variables
- scatter plot | sns.scatterplot(): uses position to show the relationship, or correlation, between two numeric values
- line chart | sns.lineplot(): shows continuous change, often used to measure change over time
- histogram | sns.histplot(): shows how one kind of data is distributed
- KDE plot | sns.kdeplot(): shows a distribution like a histogram but with smoothed lines
- box plot | sns.boxplot(): shows specific information about a distribution, such as median and outliers
Seaborn doesn’t cover every plot type (for example, pie charts are not included), but it allows us to make many kinds of pre-formatted plots quickly. Because seaborn works well with pandas and long-format data, we can usually make styled plots with fewer lines of code and less preprocessing of the data as compared with matplotlib.
Let’s view some common plot types in seaborn. Run the initial code cells to load necessary libraries and data. Then run each code cell in the notebook to view an example plot for each chart type.
As you work through the notebook, check out the similarities and differences in the code structure of each plot type. Think about:
- Which plots take both x and y parameters, and which take only one of them?
- What additional parameters do some plots take?
- What do you think those extra parameters do to the plot?
Your work here is not graded, so feel free to explore and play with the code before selecting Next to move to the next exercise.