Learn

When we talked about Measures (continuous, numerical data), we touched on aggregates. Let’s look more into what aggregates are and how we can work with them in Tableau.

The most common aggregates are `SUM()`, `AVG()`, `MEDIAN()`, `COUNT()`, and `COUNTD()`. Check out examples here and in the table to the right. (You can zoom in on the graphs to compare axis values.)

• `SUM()` adds all the values in a field.

`SUM(Trunk Diameter)` = 7,712,983 inches

• `AVG()` takes the average, or mean, of all values in a field.

`AVG(Trunk Diameter)` = 11.28 inches

• `MEDIAN()` takes the median, or middle value, of all values in a field.

`MEDIAN(Trunk Diameter)` = 9.0 inches

• `COUNT()` returns the number of items in a field.

`COUNT(Trunk Diameter)` = 683,788 trees with trunk diameter measurements

• `COUNTD()` returns the number of distinct or unique items in a field.

`COUNTD(Trunk Diameter)` = 146 different measurements of trunk diameters.

What if we wanted to ask, “which zip code has trees with the biggest trunks?” We can drag the `Zipcode` pill onto the Rows Shelf, and Tableau will automatically show us the sum of trunk diameters for all trees by `Zipcode`.

This doesn’t answer our question, though, because summing the diameters will be impacted by how many trees are in a zip code, not just how big each trunk is. A better aggregation, in this case, would be the Average since that divides the diameter sum by the number of trees. We can see which zip code has the highest average diameters by changing the default aggregate.

Here are the top zip codes by average diameter.

This brings up another important aggregate that we can make. We hypothesized that if the sum of all tree diameters by zip code were sorted that the top zip code would likely be the zip code with the most trees. We can check that with the `COUNT()` aggregate. This counts each record of a given field and even more helpful will be the `COUNTD()` aggregate. The `D` indicates a distinct count of the given field. We will try this by returning a distinct count of `Tree ID` by zip code.

### Instructions

1. Let’s create some of these charts to understand aggregates better. Open Tableau and connect to your tree census data.
2. Drag the `Trunk Diameter` field to your Rows Shelf. (You can also double click the field name and a pill will be created for you on the Rows Shelf.)
3. Pull `Borough Name` to Columns and click the Swap Rows and Columns button.
4. Drag a new `Borough Name` pill to the Color Shelf. (Hold CTRL while dragging to make a copy of a pill and from the Rows Shelf. Command+drag on a Mac.) Your view should look like this so far:
5. You’ll notice that the aggregation is still using `SUM(Trunk Diameter)`, as this is Tableau’s default, but we know that it isn’t accurate for finding the borough with the largest trunks. Let’s switch that to `AVG(Trunk Diameter)`. Hover on the pill and click the dropdown arrow to open the pill’s menu. Select `Average`.
6. CTRL or Command+drag the `AVG(Trunk Diameter)` pill to the Label shelf.
7. Sort `Borough Name` by field in Descending order with the aggregation of Average.
8. Now we see that Queens has the highest average trunk size in the city with 12.6 inches in diameter. Queens also has the greatest `SUM()` of diameters, but by a much wider margin than when compared by average. With `COUNTD()` of `TreeID` included, we can see that Queens has about 11% more trees than Brooklyn, which accounts for the much larger share of the Sum in the top chart. While the order is the same with each aggregation (this isn’t always the case!), we can understand tree trunk size in NYC better by comparing multiple aggregates.