Many tidyverse users turn to the tidyr R package for reshaping data. But I’ve seen people say they can’t remember exactly how its pivot_wider()
and pivot_longer()
functions work. Luckily, there’s an easy solution: RStudio code snippets. Write a snippet once, and what’s basically a fill-in-the-blank form will always be at your fingertips.
From wide to long
To go from wide to long with tidyr, use the pivot_longer()
function. It has the following syntax:
pivot_longer(mydata,
cols, # columns that should pivot from wide to long (unquoted)
names_to, # name of the new category column as a quoted string
values_to # name of the new value column as a quoted string
)
For a simple example, let’s look at the well-known mtcars data set, which has a wide format. It doesn’t have a column category at all — models are row names but not in their own variable — so I used the tibble package’s handy rownames_to_column()
function to add a Model column from the row names:
library(tibble)
library(tidyr)
mtcars <- tibble::rownames_to_column(mtcars, "Model")
head(mtcars) Model mpg cyl disp hp drat wt qsec vs am gear carb 1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 2 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 3 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 4 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 5 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 6 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
To convert mtcars into “tidy” or long format, all the columns starting from mpg to the last one (carb) should shift into just two new columns — one for category and another for value:
mtcars_long <- pivot_longer(mtcars,
cols = mpg:carb,
names_to = "Category",
values_to = "Value"
)
Below is code for a reusable RStudio code snippet that generates a fill-in-the-blank code explainer for pivot_longer()
whenever you invoke it. This is not R code; it’s snippet code to generate R code:
snippet plonger
pivot_longer(${1:mydf},
cols = ${2:columns to pivot long},
names_to = "${3:desired name for category column}",
values_to = "${4:desired name for value column}"
)
All the items enclosed with ${}
are snippet variables. If you don’t know how RStudio code snippets work, you can watch this tutorial video:
You can add the snippets code to your RStudio R code snippets file with the following R command:
usethis::edit_rstudio_snippets()
That opens the snippets file within RStudio. You can then copy and paste — or type — code into that file. Note that every line under the first line within the snippet and the name of the snippet (plonger) must start with a tab.
To use the snippet, start typing plonger and the snippet name will appear as a choice to select and use. Watch the video embedded at the very top of this article if you want to see the pivot_longer()
snippet in action.
Added bonus: The snippet already includes quotation marks where they’re needed, so you don’t have to add them.
From long to wide
To reshape the other way, use pivot_wider()
:
pivot_wider(mydata,
id_cols, # optional vector of columns you do not want affected
names_from, # category column(s) to pivot from long to wide
values_from # value columns(s) that hold data for each category column
names_sep # optional string separator for category-value columns
)
For this demo I’ll use the us_rent_income data frame, which has data by US state for median annual income and median monthly rent, plus margins of error for each.
GEOID NAME variable estimate moe <chr> <chr> <chr> <dbl> <dbl> 1 01 Alabama income 24476 136 2 01 Alabama rent 747 3 3 02 Alaska income 32940 508 4 02 Alaska rent 1200 13 5 04 Arizona income 27517 148 6 04 Arizona rent 972 4
For a more human-readable version, I’d want income and rent to each have their own columns: income, income margin of error, rent, and rent margin of error.
Code for that:
rent_income_wide <- pivot_wider(us_rent_income,
# id_cols = optional vector of unaffected columns,
names_from = c(variable),
values_from = c(estimate, moe),
names_sep = "_"
)
The RStudio code snippet below will generate reusable fill-in-the-blank code and explanations for each pivot_wider()
argument.
snippet pwider
pivot_wider(${1:mydf},
# id_cols = ${2:optional vector of unaffected columns},
names_from = c(${3:category column(s) to pivot wide}),
values_from = c(${4:value column(s) that hold data for each category column}),
names_sep = "_"
)
You can download a text file with both pivot snippets below. Right-click and save the link as a file on your system; don’t copy and paste from the file displayed in a browser.
For more R tips, head to the InfoWorld Do More With R page.