select()
select() extracts columns of a data frame and returns the columns as a new data frame. To use select(), pass it the name of a data frame to extract columns from, and then the names of the columns to extract. The column names do not need to appear in quotation marks or be prefixed with a $; select() knows to find them in the data frame that you supply.
Exercise: select()
Use the example below to get a feel for select(). Can you extract just the name column? How about the name and year columns? How about all of the columns except prop?
select(babynames, name)
select(babynames, name, year)
select(babynames, year, sex, name, n)
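To see what these calls return without loading the full babynames data, here is a minimal sketch on a small stand-in data frame. The object `toy` and its values are invented for illustration; only the column names mirror babynames.

```r
library(dplyr)

# Hypothetical stand-in with the same column names as babynames
# (the values are made up for illustration only)
toy <- data.frame(
  year = c(1880, 1880, 1881),
  sex  = c("F", "M", "F"),
  name = c("Mary", "John", "Anna"),
  n    = c(10L, 20L, 30L),
  prop = c(0.1, 0.2, 0.3)
)

# Columns come back in the order they are requested
picked <- select(toy, name, year)
names(picked)   # "name" "year"

# select() returns a new data frame; the input keeps all five columns
names(toy)      # "year" "sex" "name" "n" "prop"
```

Because select() does not modify its input, assign the result to a name if you want to keep it.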
select() helpers
You can also use a series of helpers with select(). For example, if you place a minus sign before a column name, select() will return every column but that column. Can you predict how the minus sign will work here?
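One way to check your prediction is to try the minus sign on a small data frame. The sketch below uses a hypothetical stand-in named `toy` with invented values and the same column names as babynames.

```r
library(dplyr)

# Hypothetical stand-in with the same column names as babynames
# (the values are made up for illustration only)
toy <- data.frame(
  year = c(1880, 1880, 1881),
  sex  = c("F", "M", "F"),
  name = c("Mary", "John", "Anna"),
  n    = c(10L, 20L, 30L),
  prop = c(0.1, 0.2, 0.3)
)

# Drop one column; every other column is kept in its original order
no_prop <- select(toy, -prop)
names(no_prop)        # "year" "sex" "name" "n"

# Drop several columns by negating each one
no_sex_prop <- select(toy, -sex, -prop)
names(no_sex_prop)    # "year" "name" "n"
```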
The table below summarizes the other select() helpers that are available in {dplyr}.
Helper function | Use | Example |
---|---|---|
- | Columns except | select(babynames, -prop) |
: | Columns between (inclusive) | select(babynames, year:n) |
contains() | Columns that contain a string | select(babynames, contains("n")) |
ends_with() | Columns that end with a string | select(babynames, ends_with("n")) |
matches() | Columns that match a regex | select(babynames, matches("n")) |
num_range() | Columns with a numerical suffix in the range | Not applicable with babynames |
one_of() | Columns whose names appear in the given set | select(babynames, one_of(c("sex", "gender"))) |
starts_with() | Columns that start with a string | select(babynames, starts_with("n")) |
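Since num_range() has no natural example in babynames, the sketch below tries the helpers on a hypothetical data frame with numbered columns. The object `scores` and its column names are assumptions made for illustration. The last call uses any_of(), which recent versions of dplyr provide as a successor to one_of(); it silently skips names that are not present.

```r
library(dplyr)

# Hypothetical data frame (not babynames) with numbered columns
scores <- data.frame(
  id    = 1:2,
  wk1   = c(5, 6),
  wk2   = c(7, 8),
  wk3   = c(9, 10),
  notes = c("a", "b")
)

# Prefix plus a numeric range: wk1 and wk2
by_range   <- names(select(scores, num_range("wk", 1:2)))

# String-based helpers
by_start   <- names(select(scores, starts_with("wk")))    # wk1, wk2, wk3
by_end     <- names(select(scores, ends_with("s")))       # notes
by_contain <- names(select(scores, contains("k")))        # wk1, wk2, wk3

# Regular expression: wk followed by 1 or 3
by_regex   <- names(select(scores, matches("^wk[13]$")))  # wk1, wk3

# Inclusive range of adjacent columns
by_colon   <- names(select(scores, id:wk2))               # id, wk1, wk2

# Names from a set; "gender" does not exist and is skipped
by_set     <- names(select(scores, any_of(c("id", "gender"))))  # id
```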