Flexibly restructure and aggregate data using just two functions: melt and dcast (or acast).
reshape2 is retired: only changes necessary to keep it on CRAN will be made. We recommend using tidyr instead.
Reshape2 is a reboot of the reshape package. It's been over five years since the first release of reshape, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much more focused and much much faster.
This version improves speed at the cost of functionality, so I have renamed it to reshape2 to avoid causing problems for existing users. Based on user feedback I may reintroduce some of the removed features.
What's new in reshape2:
considerably faster and more memory efficient thanks to a much better underlying algorithm that uses the power and speed of subsetting to the fullest extent, in most cases only making a single copy of the data.
cast is replaced by two functions depending on the output type: dcast produces data frames, and acast produces matrices/arrays.
multidimensional margins are now possible: grand_row and grand_col have been dropped, and the name of a margin now refers to the variable that has its value set to (all).
some features have been removed, such as the | cast operator and the ability to return multiple values from an aggregation function. I'm reasonably sure both these operations are better performed by plyr.
a new cast syntax which allows you to reshape based on functions of variables (based on the same underlying syntax as plyr).
better development practices like namespaces and tests.
the function melt now names the columns of its returned data frame Var1, Var2, ..., VarN instead of X1, X2, ..., XN.
the argument variable.name of melt replaces the old argument variable_name.
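As a rough illustration of the items above, the following sketch melts a small data frame, casts it back with both dcast and acast, and melts a matrix to show the default column names. The data and the object names (wide, long, as_df, as_mat, m, long_m) are made up for this example and are not part of the package.

```r
library(reshape2)

# A small wide data frame (made-up values for illustration).
wide <- data.frame(
  id = c("a", "b"),
  x = c(1, 2),
  y = c(3, 4)
)

# melt() stacks the measured columns into rows.
# variable.name (formerly variable_name) names the column holding the old column names.
long <- melt(wide, id.vars = "id", variable.name = "measure")

# dcast() returns a data frame; acast() returns a matrix.
as_df <- dcast(long, id ~ measure, value.var = "value")
as_mat <- acast(long, id ~ measure, value.var = "value")

# Melting a matrix whose dimnames are unnamed gives Var1, Var2 label columns.
m <- matrix(1:4, nrow = 2)
long_m <- melt(m)
names(long_m)
```

Here value.var is passed explicitly so that dcast and acast do not have to guess which column holds the values.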
Initial benchmarking has shown melt to be up to 10x faster, pure reshaping cast up to 100x faster, and aggregating cast up to 10x faster.
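To make the distinction between a pure reshaping cast and an aggregating cast concrete, here is a minimal sketch that also shows the (all) margin naming described in the list above. The data frame and the names long, pure, agg and with_margins are invented for this example, and sum is just one possible aggregation function.

```r
library(reshape2)

# Made-up long data: group "a" appears twice for each key, group "b" once.
long <- data.frame(
  group = c("a", "a", "b", "b", "a", "a"),
  key = c("x", "y", "x", "y", "x", "y"),
  value = c(1, 2, 3, 4, 5, 6)
)

# Pure reshaping cast: every (group, key) cell holds exactly one value,
# so no aggregation function is needed.
pure <- dcast(long[1:4, ], group ~ key, value.var = "value")

# Aggregating cast: several values fall into the same cell,
# so fun.aggregate reduces them to one value per cell.
agg <- dcast(long, group ~ key, fun.aggregate = sum, value.var = "value")

# margins = TRUE adds a margin for each variable; the margin row and
# column are labelled "(all)".
with_margins <- dcast(long, group ~ key, fun.aggregate = sum,
                      value.var = "value", margins = TRUE)
```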
This work has been generously supported by BD (Becton Dickinson).
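# Install the released version from CRAN
install.packages("reshape2")
# Or install the development version from GitHub (requires devtools)
devtools::install_github("hadley/reshape")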
Fix C/C++ problems causing R CMD CHECK errors.
melt.data.frame() throws an error when encountering objects of type POSIXlt, and requests a conversion to the (much saner) POSIXct type.
melt.data.frame() now properly sets the OBJECT bit on the value variable generated when attributes are copied (for example, when multiple POSIXct columns are concatenated to generate the value variable) (#50).
melt.data.frame() can melt data.frames containing list elements as id columns (#49).
melt.data.frame() no longer errors when measure.vars is NULL or empty (#46).
dcast() and acast() gain a useful error message if you use value_var instead of value.var (#16), and if value.var doesn't exist (#9). They also work better with . in specifications like . ~ . or x + y ~ .
melt.array() creates factor variables with levels in the same order as the original rownames (#19).
melt.data.frame() gains an internal Rcpp / C++ implementation, and is now many orders of magnitude faster. It also preserves identical attributes for measure variables, and now throws a warning if they are dropped. (Thanks to Kevin Ushey)
melt.data.frame() gains a factorsAsStrings argument that controls whether factors are converted to character when melted as measure variables. This is TRUE by default for backward compatibility.
melt.array() gains an as.is argument which can be used to prevent dimnames being converted with type.convert().
recast() now returns a data frame instead of a list (#45).
Fix incompatibility with plyr 1.8
Fix evaluation bug revealed by knitr. (Fixes #18)
Fixed a bug in melt where it didn't automatically get variable names when used with tables. (Thanks to Winston Chang)
Fixed bug in melt where factors were converted to integers, instead of to characters
When the measured variable is a factor, dcast now converts it to a character rather than throwing an error. acast still returns a factor matrix. (Thanks to Brian Diggs.)
acast is now much faster, due to fixing a very slow way of naming the output. (Thanks to José Bartolomei Díaz for the bug report)
value_var argument to acast and dcast renamed to value.var to be consistent with other argument names.
Order NA factor levels before (all) when creating margins.
Corrected reshape citation.
melt.data.frame no longer turns characters into factors.
All melt methods gain na.rm and value.name arguments; previously only melt.data.frame had them (Fixes #5).
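Several of the melt arguments mentioned in these notes can be tried together. The sketch below is only an illustration under assumed inputs: the data frames, the matrix, and the column name "reading" are invented for the example.

```r
library(reshape2)

# Made-up data with one missing measurement.
df <- data.frame(
  id = c(1, 2),
  a = c(10, NA),
  b = c(30, 40)
)

# value.name renames the value column; na.rm = TRUE drops rows whose value is NA.
long <- melt(df, id.vars = "id", value.name = "reading", na.rm = TRUE)

# The same two arguments are accepted when melting a matrix.
m <- matrix(c(1, NA, 3, 4), nrow = 2)
long_m <- melt(m, value.name = "reading", na.rm = TRUE)

# factorsAsStrings controls whether a factor measure column is melted
# to character (the default) or kept as a factor.
f <- data.frame(id = 1:2, g = factor(c("u", "v")))
as_chr <- melt(f, id.vars = "id", factorsAsStrings = TRUE)
as_fct <- melt(f, id.vars = "id", factorsAsStrings = FALSE)
```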