It’s common to have some amount of uncertainty or imprecision in measurements, particularly in empirical measurements collected by observation. Frequently, we quantify this uncertainty through confidence intervals, which account for the spread and number of data points included in the computation of our summary statistic.
We can visually represent this uncertainty by adding error bars to a graph. We’ll use matplotlib’s general function plt.errorbar(), which takes parameters for…
x, y: restate the X and Y values of the underlying graph
xerr, yerr: set error values in the X or Y direction
color: set the color of the error bar (optional)
fmt: set the marker style for the data points (for example, 'o' for circles)
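As a minimal sketch of how these parameters fit together, the snippet below draws four points with error bars in both directions. The values and the variable names (x_values, y_values, x_error, y_error) are made up for illustration and are not part of the course dataset.

```python
import matplotlib.pyplot as plt

# Hypothetical measurements, invented purely for illustration
x_values = [1, 2, 3, 4]
y_values = [10.0, 12.5, 9.0, 14.0]

# One vertical error per point; a single number applies the same
# horizontal error to every point
y_error = [1.0, 0.5, 1.5, 0.8]
x_error = 0.1

# fmt='o' draws circle markers with no connecting line;
# color applies to both the markers and the error bars
container = plt.errorbar(x_values, y_values,
                         xerr=x_error, yerr=y_error,
                         color='purple', fmt='o')
```

In a Jupyter notebook the figure appears when the cell finishes running; in a script, you would call plt.show() afterwards.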
yerr can be added as a column from the same dataframe that contains the y data, or from a separate array. Error values are calculated using statistics – exactly how it’s done is outside the scope of this course, so we’ll focus on implementing given error values. Say our local business profit dataset has a column called error_value that gives an over/under amount for each business’ profit. We can use that information to add error bars to the Profits graph like so:
## make the bar graph
plt.bar(x = data.business_name, height = data.profit, width = 0.8, align = 'center')
plt.title("Profits at 5 Businesses, 2021")
plt.xlabel("Business Name")
plt.ylabel("Profit ($)")

## add the error bars
plt.errorbar(x = data.business_name, y = data.profit, yerr = data.error_value, fmt='o', color='purple')
This will produce a bar graph with purple error bars that have circle markers. If we instead added error bars from a separate array, the code would look like this:
## make the bar graph
plt.bar(x = data.business_name, height = data.profit, width = 0.8, align = 'center')
plt.title("Profits at 5 Businesses, 2021")
plt.xlabel("Business Name")
plt.ylabel("Profit ($)")

## define the error array (one error measure for each business)
error_bars = [15, 35, 70, 25, 30]

## add the error bars from the array
plt.errorbar(x = data.business_name, y = data.profit, yerr = error_bars, fmt='o', color='purple')
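To try the pattern outside the course notebook, here is a self-contained sketch that builds a small stand-in dataframe first. The business names and profit figures are invented for illustration (only the column names mirror the ones used above), so treat it as one possible setup rather than the actual course data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the profit dataset; all values are invented
data = pd.DataFrame({
    "business_name": ["A", "B", "C", "D", "E"],
    "profit": [120, 340, 560, 210, 280],
    "error_value": [15, 35, 70, 25, 30],
})

# Bar graph of profit per business
bars = plt.bar(x=data.business_name, height=data.profit, width=0.8, align='center')
plt.title("Profits at 5 Businesses, 2021")
plt.xlabel("Business Name")
plt.ylabel("Profit ($)")

# Error bars centered on the top of each bar
errors = plt.errorbar(x=data.business_name, y=data.profit,
                      yerr=data.error_value, fmt='o', color='purple')

# Each error bar spans profit - error_value up to profit + error_value
lower = data.profit - data.error_value
upper = data.profit + data.error_value
```

Because yerr is a single list of values here, each bar extends the same distance above and below its marker, which matches the over/under reading of error_value.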
In the Jupyter notebook, let’s add error bars to the bar chart we made in the last exercise.
Run the Setup cells to load in the necessary packages and datasets. Run the cell below to see the bar_data dataset again. We’ll use the error column in this exercise.
In the space above plt.show(), write the code to add error bars using the error column. Set the marker to 'o' and make the error bar color