arrange()

arrange() returns all of the rows of a data frame reordered by the values of a column. As with select(), the first argument of arrange() should be a data frame and the remaining arguments should be the names of columns. If you give arrange() a single column name, it will return the rows of the data frame reordered so that the row with the lowest value in that column appears first, the row with the second lowest value appears second, and so on. If the column contains character strings, arrange() will place them in alphabetical order.

Exercise: arrange()

Use the code chunk below to arrange babynames by n. Can you tell what the smallest value of n is?

arrange(babynames, n)

Good job! The compiler of babynames used 5 as a cutoff; a name only made it into babynames for a given year and gender if it was used for five or more children.

Tie breakers

If you supply additional column names, arrange() will use them as tie breakers to order rows that have identical values in the earlier columns. Add to the code below, to make prop a tie breaker. The result should first order rows by value of n and then reorder rows within each value of n by values of prop.

arrange(babynames, n, prop)

desc()

If you would rather arrange rows in the opposite order, i.e. from large values to small values, surround a column name with desc(). arrange() will reorder the rows based on the largest values to the smallest.

Add a desc() to the code below to display the most popular name for 2017 (the largest year in the dataset) instead of 1880 (the smallest year in the dataset).

arrange(babynames, desc(year), desc(prop))

Think you have it? Click Continue to test yourself.

arrange() quiz

Which name was the most popular for a single gender in a single year? In the code chunk below, use arrange() to make the row with the largest value of prop appear at the top of the data set.

arrange(babynames, desc(prop))

Now arrange babynames so that the row with the largest value of n appears at the top of the data frame. Will this be the same row? Why or why not?

arrange(babynames, desc(n))

The number of children represented by each proportion grew over time as the population grew.

Next topic