Flexibly Reshape Data: A Reboot of the Reshape Package

Flexibly restructure and aggregate data using just two functions: melt and dcast (or acast).


Status

reshape2 is retired: only changes necessary to keep it on CRAN will be made. We recommend using tidyr instead.
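
For readers following that recommendation, here is a rough sketch of a melt/dcast round trip written with tidyr instead. It assumes tidyr 1.0.0 or later, where pivot_longer() and pivot_wider() are available; the data frame and its column names are made up for illustration.

```r
library(tidyr)

# Hypothetical wide data: one row per subject, one column per measurement.
wide <- data.frame(id = c(1, 2), height = c(170, 180), weight = c(65, 80))

# Roughly what melt(wide, id.vars = "id") does in reshape2.
long <- pivot_longer(wide, cols = c(height, weight),
                     names_to = "variable", values_to = "value")

# Roughly what dcast(long, id ~ variable) does in reshape2.
wide_again <- pivot_wider(long, names_from = variable, values_from = value)
```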

Introduction

Reshape2 is a reboot of the reshape package. It's been over five years since the first release of reshape, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much more focused and much much faster.

This version improves speed at the cost of functionality, so I have renamed it to reshape2 to avoid causing problems for existing users. Based on user feedback I may reintroduce some of these features.
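
As a minimal sketch of the basic workflow, the example below melts a small wide data frame into long form and casts it back. The data and column names are invented for illustration.

```r
library(reshape2)

# Hypothetical wide data: one row per subject, one column per measurement.
wide <- data.frame(
  id = c(1, 2),
  height = c(170, 180),
  weight = c(65, 80)
)

# melt() stacks the measured columns into (variable, value) pairs.
long <- melt(wide, id.vars = "id")

# dcast() spreads them back out; the formula reads rows ~ columns.
wide_again <- dcast(long, id ~ variable, value.var = "value")

# acast() does the same reshaping but returns a matrix.
mat <- acast(long, id ~ variable, value.var = "value")
```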

What's new in reshape2:

  • considerably faster and more memory efficient thanks to a much better underlying algorithm that uses the power and speed of subsetting to the fullest extent, in most cases only making a single copy of the data.

  • cast is replaced by two functions depending on the output type: dcast produces data frames, and acast produces matrices/arrays.

  • multidimensional margins are now possible. grand_row and grand_col have been dropped; the name of the margin now refers to the variable that has its value set to (all).

  • some features have been removed such as the | cast operator, and the ability to return multiple values from an aggregation function. I'm reasonably sure both these operations are better performed by plyr.

  • a new cast syntax which allows you to reshape based on functions of variables (based on the same underlying syntax as plyr).

  • better development practices like namespaces and tests.

  • the function melt now names the columns of its returned data frame Var1, Var2, ..., VarN instead of X1, X2, ..., XN.

  • the argument variable.name of melt replaces the old argument variable_name.
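
Several of the points in the list above can be sketched in one short example. The data set is hypothetical, and ceiling(day / 7) is just one illustrative choice of a function of a variable inside the cast formula.

```r
library(reshape2)

# Hypothetical example data: two weeks of daily readings.
daily <- data.frame(
  day  = 1:14,
  temp = rep(c(10, 20), each = 7),
  rain = rep(c(1, 3), each = 7)
)

# variable.name (formerly variable_name) names the column of variable labels.
long <- melt(daily, id.vars = "day", variable.name = "measure")

# dcast() returns a data frame; acast() returns a matrix.
as_df  <- dcast(long, day ~ measure, value.var = "value")
as_mat <- acast(long, day ~ measure, value.var = "value")

# A function of a variable in the formula: group days into weeks and average.
weekly <- dcast(long, ceiling(day / 7) ~ measure, mean, value.var = "value")

# Margins: the added row and column are labelled "(all)".
long$week <- ceiling(long$day / 7)
with_margins <- dcast(long, week ~ measure, mean,
                      margins = TRUE, value.var = "value")

# Melting a matrix without dimnames yields columns Var1, Var2 and value.
molten_matrix <- melt(matrix(1:4, nrow = 2))
```

Passing a character vector of variable names to margins, instead of TRUE, should restrict the margins to those variables.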

Initial benchmarking has shown melt to be up to 10x faster, pure reshaping cast up to 100x faster, and aggregating cast up to 10x faster.

This work has been generously supported by BD (Becton Dickinson).

Installation

  • Get the released version from CRAN: install.packages("reshape2")
  • Get the development version from GitHub: devtools::install_github("hadley/reshape")

News

Version 1.4.3

  • Fix C/C++ problems causing R CMD check errors.

  • melt.data.frame() throws an error when encountering objects of type POSIXlt, and requests a conversion to the (much saner) POSIXct type.

Version 1.4.2

  • Minor R CMD check fixes for CRAN.

Version 1.4.1

  • melt.data.frame() now properly sets the OBJECT bit on value variable generated if attributes are copied (for example, when multiple POSIXct columns are concatenated to generate the value variable) (#50)

  • melt.data.frame() can melt data.frames containing list elements as id columns. (#49)

  • melt.data.frame() no longer errors when measure.vars is NULL or empty. (#46)

Version 1.4

  • dcast() and acast() gain a useful error message if you use value_var instead of value.var (#16), and if value.var doesn't exist (#9). They also work better with . in specifications like . ~ . or x + y ~ .

  • melt.array() creates factor variables with levels in the same order as the original rownames (#19)

  • melt.data.frame() gains an internal Rcpp / C++ implementation, and is now many orders of magnitude faster. It also preserves identical attributes for measure variables, and now throws a warning if they are dropped. (Thanks to Kevin Ushey)

  • melt.data.frame() gains a factorsAsStrings argument that controls whether factors are converted to character when melted as measure variables. This is TRUE by default for backward compatibility.

  • melt.array() gains an as.is argument which can be used to prevent dimnames being converted with type.convert()

  • recast() now returns a data frame instead of a list (#45).
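
The following sketch illustrates three of the 1.4 changes above: factorsAsStrings, as.is, and the error raised for a value.var that does not exist. The data and the column name "nope" are made up for illustration.

```r
library(reshape2)

# factorsAsStrings: a factor measure variable becomes character by default.
df <- data.frame(id = 1:2, grade = factor(c("a", "b")))
default_value <- melt(df, id.vars = "id")$value
kept_factor   <- melt(df, id.vars = "id", factorsAsStrings = FALSE)$value

# as.is: by default dimnames go through type.convert(), so "01" becomes 1.
m <- matrix(1:4, nrow = 2, dimnames = list(c("01", "02"), c("x", "y")))
converted <- melt(m)$Var1
verbatim  <- melt(m, as.is = TRUE)$Var1

# A value.var that is not a column of the data raises an error.
long <- melt(df, id.vars = "id")
result <- tryCatch(
  dcast(long, id ~ variable, value.var = "nope"),
  error = function(e) e
)
```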

Version 1.2.2

  • Fix incompatibility with plyr 1.8

  • Fix evaluation bug revealed by knitr. (Fixes #18)

  • Fixed a bug in melt where it didn't automatically get variable names when used with tables. (Thanks to Winston Chang)

Version 1.2.1

  • Fix bug in multiple margins revealed by plyr 1.7, but caused by mis-use of data frame subsetting.

Version 1.2

  • Fixed bug in melt where factors were converted to integers, instead of to characters

  • When the measured variable is a factor, dcast now converts it to a character rather than throwing an error. acast still returns a factor matrix. (Thanks to Brian Diggs.)

  • acast is now much faster, due to fixing a very slow way of naming the output. (Thanks to José Bartolomei Díaz for the bug report)

  • value_var argument to acast and dcast renamed to value.var to be consistent with other argument names

  • Order NA factor levels before (all) when creating margins

  • Corrected reshape citation.

Version 1.1

  • melt.data.frame no longer turns characters into factors

  • All melt methods gain na.rm and value.name arguments; previously only melt.data.frame had them (Fixes #5)
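
A short sketch of na.rm and value.name in use, together with the matching value.var argument when casting back. The scores data frame is hypothetical.

```r
library(reshape2)

# Hypothetical data with some missing measurements.
scores <- data.frame(id = 1:3, test1 = c(90, NA, 70), test2 = c(85, 60, NA))

# na.rm = TRUE drops rows whose value is NA; value.name renames the value column.
long <- melt(scores, id.vars = "id", na.rm = TRUE, value.name = "score")

# The value column is no longer called "value", so name it via value.var.
wide <- dcast(long, id ~ variable, value.var = "score")
```

Combinations dropped by na.rm come back as NA in the cast result unless a fill value is supplied.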

1.4.3 by Hadley Wickham, 2 months ago


https://github.com/hadley/reshape


Report a bug at https://github.com/hadley/reshape/issues


Browse source code at https://github.com/cran/reshape2


Authors: Hadley Wickham



MIT + file LICENSE license


Imports plyr, Rcpp, stringr

Suggests covr, lattice, testthat

Linking to Rcpp


Imported by ABHgenotypeR, AMR, AlignStat, AppliedPredictiveModeling, BACCT, BAwiR, BBEST, BEQI2, BacArena, BatchMap, BinarybalancedCut, CANSIM2R, ChainLadder, CityWaterBalance, ClimClass, Cluster.OBeu, CoDiNA, CopulaDTA, Cubist, DGM, DVHmetrics, DataExplorer, DeLorean, DescribeDisplay, EEM, EasyHTMLReport, EcoGenetics, EpiDynamics, EvolutionaryGames, Evomorph, FinCal, ForeCA, ForecastFramework, FreqProf, G2Sd, GCalignR, GD, GetDFPData, GetITRData, HBP, HH, HLMdiag, HRM, IATscores, IGM.MEA, IMTest, ITNr, InterfaceqPCR, LambertW, MRMR, MetaComp, MiRAnorm, MixSIAR, Mobilize, Momocs, MultiMeta, NMF, NPflow, NetworkComparisonTest, NeuralNetTools, OpasnetUtils, OutbreakTools, PCADSC, PSAboot, PTXQC, PWFSLSmoke, PhenotypeSimulator, Plasmidprofiler, PlotPrjNetworks, PredictTestbench, RAM, RDS, RNeXML, RSSL, RStoolbox, RcmdrPlugin.FuzzyClust, RelimpPCR, Rnightlights, SCORPIUS, SEERaBomb, SIRItoGTFS, STMedianPolish, SWMPr, SensMixed, SensoMineR, Seurat, ShinyItemAnalysis, SixSigma, SoyNAM, StMoMo, TSMining, TSstudio, TcGSA, TippingPoint, TripleR, TropFishR, Umatrix, UncertainInterval, VDAP, aLFQ, adegenet, advclust, afex, algstat, anchoredDistr, aoristic, apsimr, aslib, assignPOP, backShift, bayesPop, bayesplot, bikedata, bioplots, blkbox, blockseg, bmmix, boclust, broom, bulletr, burnr, cRegulome, cancerGI, capm, caret, cellWise, childsds, classify, clhs, clifro, clustMD, clusterfly, coefplot, cooccur, covmat, cplm, cutoffR, dartR, data360r, dataRetrieval, dbhydroR, dcmr, deconstructSigs, dendroTools, denovolyzeR, desplot, detectRUNS, drfit, dtwSat, dtwclust, dynr, econullnetr, ecr, effectR, elasticIsing, emdi, enpls, evolqg, exreport, extracat, ez, fChange, fSRM, factoextra, factorMerger, fcm, fergm, flows, fmriqa, forecastHybrid, frailtySurv, freesurfer, funModeling, gdm, genBaRcode, genBart, genotypeR, ggcorrplot, gge, ggedit, ggenealogy, ggiraphExtra, gglogo, ggmap, ggparallel, ggplot2, ggspatial, ggspectra, granovaGG, graphTweets, grapherator, gridsampler, growcurves, 
growfunctions, hR, hazus, heatmaply, hybridModels, hyfo, iNEXT, icr, imputeR, imputeTestbench, iprior, ivmodel, kehra, laketemps, lans2r, lavaSearch2, ldatuning, likert, lmms, lsbclust, lsl, magclass, mandelbrot, mapStats, marmap, mcMST, mdpeer, meaRtools, medicalrisk, mem, metacoder, metaheur, mixOmics, mizer, morse, mortAAR, mplot, mrfDepth, multdyn, mvdalab, myTAI, narray, ncappc, neotoma, networkreporting, networktools, oaxaca, obAnalytics, onemap, openair, optiSel, orderedLasso, outreg, pRF, patPRO, pdfetch, pdolsms, photobiologyInOut, pinbasic, planar, plsgenomics, polypoly, pompom, powerbydesign, pqantimalarials, preText, prepdat, preprosim, preproviz, proteomics, psData, pscore, psychmeta, ptycho, qdap, qgraph, quadrupen, rSQM, rWBclimate, rYoutheria, randomForestExplainer, rbi, rclimateca, rdiversity, refund.shiny, rfPermute, rmcfs, rplos, rusk, rwty, sValues, santaR, scmamp, sharpshootR, shinyKGode, shinystan, simmr, skeleSim, snht, soc.ca, soilDB, sorvi, soundgen, sourceR, sparsevar, spatialwarnings, speaq, spectacles, sprm, stability, stacomiR, structSSI, svars, svdvis, swfscMisc, sysid, taRifx, taxize, tetraclasse, timeseriesdb, timma, toaster, treeDA, tsiR, tvm, vanddraabe, veccompare, vmd, vpc, wTO, warpMix, widyr, wppExplorer, wql, xsp, xxIRT, yorkr, zebu, zonator.

Depended on by AurieLSHGaussian, BCellMA, CINOEDV, CompetingRisk, ESGtoolkit, JAGUAR, ScottKnottESD, SpatialFloor, TriMatch, clickstream, difNLR, diverse, gapmap, ifaTools, metaforest, mhtboot, mmppr, mudfold, pxR, reshapeGUI, tcR, tmpm, toolmaRk, validateRS, wordmatch, zoocat.

Suggested by ANN2, ARPobservation, DataVisualizations, GSODR, GeneralizedUmatrix, Information, Lahman, MGLM, MOEADr, MTA, PDQutils, ParamHelpers, Perc, ProjectionBasedClustering, RDML, RNOmni, Rlda, Tmisc, agridat, alluvial, bmlm, bodenmiller, bridgedist, causaldrf, codyn, data.table, flowr, frequencyConnectedness, funData, ggQC, ggalt, ggsci, ggswissmaps, ggthemes, glmmTMB, hdf5r, heplots, heuristica, httk, iheatmapr, knitrBootstrap, lda, logitnorm, ltbayes, metafolio, microplot, mlxR, mmpf, mosaicData, nLTT, neurobase, nlmixr, nullabor, pals, pdSpecEst, physiology, polymapR, pomp, popEpi, productplots, propr, psd, ragtop, rangemodelR, refund, rfordummies, rgbif, rmetasim, robustbase, robustlmm, rpf, rtop, scanstatistics, sdmpredictors, sensitivity, shadow, snpReady, socialmixr, sparseMVN, spew, ss3sim, tableone, tictactoe, tidytext, tourr, treecm, treespace, tstools, tukeytrend, vcfR, vkR, xltabr, xtractomatic.

