103 Data Frame Reshape
Jingwen Bai
103.0.1 Introduction
In this tutorial we will focusing on the reshape of a data frame in R using pacakges reshape2
and tidyr
.
First, we need to know what are wide-format data or long-format data. Wide data has a column for each variable. While long data using all variable as values.
103.0.2 Convert wide-format data to long-format data
103.0.2.1 Method 1: Use melt() in package reshape2
Melt takes wide-format data and melts it into long-format data. Default parameter:
ID variables are the variables that. By default, melt has assumed that all columns with numeric values are variables with values. So here we have player
as id variables.
# assign variable and value's column name
library(reshape2)
df_melt_name <- reshape2::melt(df,id.vars = c("player"), variable.name = "year",value.name = "value")
df_melt_name
Now lets try combined ID variables
103.0.2.2 Method 2: Use gather() in tidyr
The gather() function from the tidyr package can be used to “gather” a key-value pair across multiple columns.
#gather data from columns except player column
df_gather <- tidyr::gather(data = df,key = 'year', value = 'value', -player)
df_gather
Now, let’s try gather values from more than two columns
df_gather_multi <- tidyr::gather(data = df,key = 'year', value = 'value', 3:5)
# df_gather <- tidyr::gather(data = df,key = 'year', value = 'value', -x) <- same
df_gather_multi
103.0.3 Convert long-format data to wide-format data
Now we have learned how to convert from wide-format data to long-format data. Now let’s try the inverse
#### Method 1: Use dcast() in package reshape2
The dcast formula takes the form LHS ~ RHS.
Now let’s revert the df_melt_multi
df_cast_muli <- reshape2::dcast(df_melt_multi, player + classYear ~ year,value.var = 'value')
head(df_cast)