Learn

When we talked about Measures (continuous, numerical data), we touched on aggregates. Let’s look more into what aggregates are and how we can work with them in Tableau.

The most common aggregates are SUM(), AVG(), MEDIAN(), COUNT(), and COUNTD(). Check out examples here and in the table to the right. (You can zoom in on the graphs to compare axis values.)

  • SUM() adds all the values in a field.

    SUM(Trunk Diameter) = 7,712,983 inches

  • AVG() takes the average, or mean, of all values in a field.

    AVG(Trunk Diameter) = 11.28 inches

  • MEDIAN() takes the median, or middle value, of all values in a field.

    MEDIAN(Trunk Diameter) = 9.0 inches

  • COUNT() returns the number of items in a field.

    COUNT(Trunk Diameter) = 683,788 trees with trunk diameter measurements

  • COUNTD() returns the number of distinct or unique items in a field.

    COUNTD(Trunk Diameter) = 146 different measurements of trunk diameters.

What if we wanted to ask, “which zip code has trees with the biggest trunks?” We can drag the Zipcode pill onto the Rows Shelf, and Tableau will automatically show us the sum of trunk diameters for all trees by Zipcode.

Top Zipcodes Count

This doesn’t answer our question, though, because summing the diameters will be impacted by how many trees are in a zip code, not just how big each trunk is. A better aggregation, in this case, would be the Average since that divides the diameter sum by the number of trees. We can see which zip code has the highest average diameters by changing the default aggregate.

Here are the top zip codes by average diameter.

Top Zipcode Average

This brings up another important aggregate that we can make. We hypothesized that if the sum of all tree diameters by zip code were sorted that the top zip code would likely be the zip code with the most trees. We can check that with the COUNT() aggregate. This counts each record of a given field and even more helpful will be the COUNTD() aggregate. The D indicates a distinct count of the given field. We will try this by returning a distinct count of Tree ID by zip code.

Sum Average COUNTD

Instructions

  1. Let’s create some of these charts to understand aggregates better. Open Tableau and connect to your tree census data.
  2. Drag the Trunk Diameter field to your Rows Shelf. (You can also double click the field name and a pill will be created for you on the Rows Shelf.)
  3. Pull Borough Name to Columns and click the Swap Rows and Columns button. Swap Icon
  4. Drag a new Borough Name pill to the Color Shelf. (Hold CTRL while dragging to make a copy of a pill and from the Rows Shelf. Command+drag on a Mac.) Your view should look like this so far: Borough by Sum
  5. You’ll notice that the aggregation is still using SUM(Trunk Diameter), as this is Tableau’s default, but we know that it isn’t accurate for finding the borough with the largest trunks. Let’s switch that to AVG(Trunk Diameter). Hover on the pill and click the dropdown arrow to open the pill’s menu. Select Average. Switch to Average
  6. CTRL or Command+drag the AVG(Trunk Diameter) pill to the Label shelf.
  7. Sort Borough Name by field in Descending order with the aggregation of Average. Sort Descending
  8. Now we see that Queens has the highest average trunk size in the city with 12.6 inches in diameter. Avg Trunk Sorted Queens also has the greatest SUM() of diameters, but by a much wider margin than when compared by average. Borough by Sum Avg COUNTD With COUNTD() of TreeID included, we can see that Queens has about 11% more trees than Brooklyn, which accounts for the much larger share of the Sum in the top chart. While the order is the same with each aggregation (this isn’t always the case!), we can understand tree trunk size in NYC better by comparing multiple aggregates.

Take this course for free

Mini Info Outline Icon
By signing up for Codecademy, you agree to Codecademy's Terms of Service & Privacy Policy.

Or sign up using:

Already have an account?