# Counts

### `geom_count()`

Boxplots provide an efficient way to explore the interaction of a continuous variable and a categorical variable. But what if you have two categorical variables?

You can see how observations are distributed across two categorical variables with `geom_count()`. `geom_count()` draws a point at each combination of values from the two variables. The size of the point is mapped to the number of observations with this combination of values. Rare combinations will have small points; frequent combinations will have large points.

### Exercise 8: Count plots

Use `geom_count()` to plot the interaction of the `cut` and `clarity` variables in the `diamonds` data set.

```
ggplot(data = diamonds) +
  geom_count(mapping = aes(x = cut, y = clarity))
```
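To see the numbers behind the point sizes, you can inspect the data that {ggplot2} computes for the layer. The sketch below assumes the plot is saved to an object named `p` (a name chosen here for illustration) and uses `layer_data()` from {ggplot2}, which returns one row per point along with the computed count `n`.

```r
library(ggplot2)

# Same plot as in Exercise 8, saved to an object so it can be inspected
p <- ggplot(data = diamonds) +
  geom_count(mapping = aes(x = cut, y = clarity))

# layer_data() returns the data computed for a layer of the plot;
# for geom_count() this includes `n`, the number of observations at each point
computed <- layer_data(p)
head(computed[, c("x", "y", "n")])
```

In this output, `x` and `y` hold the positions of the factor levels on each axis rather than the level labels.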

### `count()`

You can use the `count()` function in the {dplyr} package to compute the count values displayed by `geom_count()`. To use `count()`, pass it a data frame and then the names of zero or more variables in the data frame. `count()` will return a new table that lists how many observations occur with each combination of the listed variables that appears in the data.

For example, the code below returns the counts that you visualized in Exercise 8.

```
diamonds |>
  count(cut, clarity)
```

```
# A tibble: 40 × 3
   cut   clarity     n
   <ord> <ord>   <int>
 1 Fair  I1        210
 2 Fair  SI2       466
 3 Fair  SI1       408
 4 Fair  VS2       261
 5 Fair  VS1       170
 6 Fair  VVS2       69
 7 Fair  VVS1       17
 8 Fair  IF          9
 9 Good  I1         96
10 Good  SI2      1081
# ℹ 30 more rows
```
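Because `count()` accepts zero or more variables, it may help to see how the result changes with the number of variables you pass. The sketch below is illustrative only: the object names `total`, `by_cut`, and `by_both` are made up for this example, and `sort = TRUE` is an optional `count()` argument that is not otherwise used in this tutorial.

```r
library(ggplot2)  # provides the diamonds data set
library(dplyr)

# Zero variables: a single row holding the total number of observations
total <- diamonds |>
  count()

# One variable: one row for each value of cut
by_cut <- diamonds |>
  count(cut)

# Two variables, with the most frequent combinations listed first
by_both <- diamonds |>
  count(cut, clarity, sort = TRUE)

by_both
```

In each case the `n` column sums to the number of rows in `diamonds`, since every observation falls into exactly one group.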

### Heat maps

Heat maps provide a second way to visualize the relationship between two categorical variables. They work like count plots, but use fill color instead of point size to display the number of observations in each combination.

### How to make a heat map

{ggplot2} does not provide a geom function for heat maps, but you can construct a heat map by plotting the results of `count()` with `geom_tile()`.

To do this, set the x and y aesthetics of `geom_tile()` to the variables that you pass to `count()`. Then map the fill aesthetic to the `n` variable computed by `count()`. The plot below displays the same counts as the plot in Exercise 8.

```
diamonds |>
  count(cut, clarity) |>
  ggplot() +
  geom_tile(mapping = aes(x = cut, y = clarity, fill = n))
```
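If the pipeline feels dense, the same heat map can be built in two separate steps so that the intermediate table is visible. This is only a sketch of one way to organize the code; the object names `cut_clarity_counts` and `heat_map` are hypothetical.

```r
library(ggplot2)
library(dplyr)

# Step 1: compute one row of counts per cut/clarity combination
cut_clarity_counts <- diamonds |>
  count(cut, clarity)

# Step 2: draw one tile per row, filled according to the count n
heat_map <- ggplot(data = cut_clarity_counts) +
  geom_tile(mapping = aes(x = cut, y = clarity, fill = n))

heat_map
```

Each row of `cut_clarity_counts` becomes one tile in the plot, so the table and the heat map carry the same information.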

### Exercise 9: Make a heat map

Practice the method above by re-creating the heat map below.

```
diamonds |>
  count(color, cut) |>
  ggplot(mapping = aes(x = color, y = cut)) +
  geom_tile(mapping = aes(fill = n))
```


### Recap

Boxplots, dotplots and violin plots provide an easy way to look for relationships between a continuous variable and a categorical variable. Violin plots convey a lot of information quickly, but boxplots have a head start in popularity because they were easy to use when statisticians had to draw graphs by hand.

In any of these graphs, look for distributions, ranges, medians, skewness, or anything else that changes in an unusual way from one distribution to the next. Often, you can make patterns even more revealing with the `fct_reorder()` function from the {forcats} package (we’ll wait to learn about {forcats} until after you study factors).

Count plots and heat maps help you see how observations are distributed across the interactions of two categorical variables.