Various R programming tools for data manipulation, including:
- medical unit conversions ('ConvertMedUnits', 'MedUnits'),
- combining objects ('bindData', 'cbindX', 'combine', 'interleave'),
- character vector operations ('centerText', 'startsWith', 'trim'),
- factor manipulation ('levels', 'reorder.factor', 'mapLevels'),
- obtaining information about R objects ('object.size', 'elem', 'env', 'humanReadable', 'is.what', 'll', 'keep', 'ls.funs', 'Args', 'nPairs', 'nobs'),
- manipulating MS-Excel formatted files ('read.xls', 'installXLSXsupport', 'sheetCount', 'xlsFormats'),
- generating fixed-width format files ('write.fwf'),
- extricating components of date & time objects ('getYear', 'getMonth', 'getDay', 'getHour', 'getMin', 'getSec'),
- operations on columns of data frames ('matchcols', 'rename.vars'),
- matrix operations ('unmatrix', 'upperTriangle', 'lowerTriangle'),
- operations on vectors ('case', 'unknownToNA', 'duplicated2', 'trimSum'),
- operations on data frames ('frameApply', 'wideByFactor'),
- value of last evaluated expression ('ans'), and
- wrapper for 'sample' that ensures consistent behavior for both scalar and vector arguments ('resample').
The requirement for Perl version 5.10.0 or later is now specified in the package DESCRIPTION.
first() and last() are now simply wrappers for calls to 'head(x, n=1)' and 'tail(x, n=1)', respectively.
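The base R calls named above can be tried directly. This is a minimal sketch using only 'head' and 'tail'; the variable names are illustrative and not part of gdata.

```r
# Base R calls that first() and last() are described as wrapping.
x <- c(a = 10, b = 20, c = 30)

first_elem <- head(x, n = 1)  # first element, names preserved
last_elem  <- tail(x, n = 1)  # last element, names preserved

print(first_elem)
print(last_elem)
```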
New functions first() and last() to return the first or last element of an object.
New functions left() and right() to return the leftmost or rightmost n (default 6) columns of a matrix or data frame.
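The column selection described above can be approximated in base R. This is a sketch only: 'left_cols' and 'right_cols' are hypothetical names, and the code does not claim to reproduce how left() and right() are implemented.

```r
# Select the leftmost / rightmost n columns of a matrix using base R indexing.
m <- matrix(1:40, nrow = 4, ncol = 10)
n <- 6  # same default as described for left() and right()

col_index  <- seq_len(ncol(m))
left_cols  <- m[, head(col_index, n), drop = FALSE]  # columns 1..6
right_cols <- m[, tail(col_index, n), drop = FALSE]  # columns 5..10

print(dim(left_cols))
print(dim(right_cols))
```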
New 'scientific' argument to write.fwf(). Set 'scientific=FALSE' to prevent numeric columns from being displayed using scientific notation.
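The effect of suppressing scientific notation can be seen with base R's format(), which has an argument of the same name. Whether write.fwf() passes its argument through to format() is an assumption here; the sketch only shows the difference in the resulting text.

```r
# Compare default formatting with scientific notation switched off.
big <- 1e10

default_text <- format(big)                      # uses scientific notation by default
fixed_text   <- format(big, scientific = FALSE)  # plain digits

print(default_text)
print(fixed_text)
```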
The 'standard' argument to humanReadable() now accepts three values, 'SI' for base 1000 ('MB'), 'IEC' for base 1024 ('MiB'), and 'Unix' for base 1024 and single-character units ('M')
object.size() now returns objects with S3 class 'object_sizes' (note the final 's') to avoid conflicts with the methods for class 'object_size' provided by package 'utils', which can only handle a scalar size.
New 'units' argument to humanReadable()--and hence to print.object_sizes() and format.object_sizes()--that permits specifying the unit to use for all values. Use 'bytes' to display all values with the unit 'bytes', use 'auto' (or leave it missing) to automatically select the best unit, and use a unit from the selected standard to use that unit (e.g. 'MiB').
The default arguments to humanReadable() have changed. The defaults are now 'width=NULL' and 'digits=1', so that the default behavior is now to show one digit after the decimal point for all values.
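The difference between base-1000 and base-1024 units can be illustrated with a small stand-alone helper. 'human_size' is a hypothetical function written for this note, not the humanReadable() implementation, and it covers only the 'SI' and 'IEC' cases with one digit after the decimal point.

```r
# Hypothetical helper: format byte counts using SI (base 1000) or IEC (base 1024) units.
human_size <- function(bytes, standard = c("IEC", "SI"), digits = 1) {
  standard <- match.arg(standard)
  base  <- if (standard == "SI") 1000 else 1024
  units <- if (standard == "SI") {
    c("B", "kB", "MB", "GB", "TB")
  } else {
    c("B", "KiB", "MiB", "GiB", "TiB")
  }
  # Number of unit thresholds that each value reaches (0 means plain bytes).
  power <- findInterval(bytes, base^(1:4))
  sprintf(paste0("%.", digits, "f %s"), bytes / base^power, units[power + 1])
}

sizes    <- c(1536, 1e6, 1048576)
si_text  <- human_size(sizes, "SI")
iec_text <- human_size(sizes, "IEC")

print(si_text)
print(iec_text)
```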
reorder.factor() was ignoring the argument 'X' unless 'FUN' was supplied, making it incompatible with the behavior of stats:::reorder.default(). This has been corrected, so that calling reorder on a factor with arguments 'X' and/or 'FUN' should now return the same results whether gdata is loaded or not. (Reported by Sam Hunter.)
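For reference, this is the behavior of reorder() from the stats package when called on a factor with 'X' and 'FUN', run without gdata loaded. The data are made up for illustration.

```r
# Reorder factor levels by the mean of an accompanying numeric vector.
trt   <- factor(c("a", "b", "c", "a", "b", "c"))
score <- c(3, 1, 2, 5, 1, 2)

# Mean score per level: a = 4, b = 1, c = 2, so levels sort as b, c, a.
reordered <- reorder(trt, score, FUN = mean)

print(levels(reordered))
```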
write.fwf() now properly supports matrix objects, including matrix objects without column names. (Reported by Carl Witthoft.)
Replaced deprecated Perl function POSIX::isdigit in xls2csv.pl (which is used by read.xls()) with an equivalent regular expression. (Reported by Charles Plessy, Gerrit-jan Schutten, and Paul Johnson. Charles also provided a patch to correct the issue.)
aggregate.table(), which has been defunct since gdata 2.13.3 (2014-04-04), has now been completely removed.
read.xls() can now properly process XLSX files with up to 16384 columns (the maximum generated by Microsoft Excel).
read.xls() now properly handles XLS/XLSX files that use 1904-01-01 (the default for MS-Excel files created on the Mac) instead of 1900-01-01 as the reference value for dates.
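The two date systems differ by 1462 days, so the same calendar date has a different serial number in each. The sketch below converts serial numbers with base R's as.Date(); it illustrates the arithmetic only and is not how read.xls() performs the conversion. The origin "1899-12-30" is the commonly used value for the 1900 system because it compensates for Excel treating 1900 as a leap year.

```r
# Convert Excel serial day numbers to R dates under each date system.
serial_1900 <- 43831          # 2020-01-01 in the 1900 date system
serial_1904 <- 43831 - 1462   # the same day in the 1904 date system

date_1900 <- as.Date(serial_1900, origin = "1899-12-30")
date_1904 <- as.Date(serial_1904, origin = "1904-01-01")

print(date_1900)
print(date_1904)
```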
'aggregate.table' is now defunct. See '?gdata-defunct' for details.
Unit tests and vignettes now follow R standard practice.
Minor changes to clean up R CMD check warnings.
Simplify ll() by converting a passed list to an environment, avoiding the need for special casing and the use of attach/detach.
Wording of deprecation warning message in aggregate.table clarified.
New 'duplicated2' function which returns TRUE for all elements that are duplicated, including the first, contributed by Liviu Andronic. This differs from 'duplicated', which returns TRUE only for the second and following (second-to-last and previous when 'fromLast=TRUE') duplicate elements.
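The difference can be reproduced in base R by combining a forward and a backward pass of 'duplicated'. This is a sketch of the idea, not necessarily how 'duplicated2' is written.

```r
# Flag every element that occurs more than once, including first occurrences.
x <- c("a", "b", "a", "c", "b")

later_dups <- duplicated(x)                                   # second and following only
all_dups   <- duplicated(x) | duplicated(x, fromLast = TRUE)  # every repeated element

print(later_dups)
print(all_dups)
```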
New 'ans' function to return the value of the last evaluated top-level expression (a convenience function for accessing .Last.value), contributed by Liviu Andronic.
On Windows, warning messages printed to stdout by perl were being included in the return value from 'system', resulting in errors in 'sheetCount' and 'sheetNames'. Corrected.
The 'MedUnits' column names 'SIUnits' and 'ConventionalUnits' were reversed and misspelled.
Mark example for installXLSXsupport() as dontrun so R CMD check won't fail on systems where Perl is not fully functional.
Correct name of installXLSsupport() in tests/test.read.xls.R.
New ls.funs() function to list all objects of class function in the specified environment.
New startsWith() function to determine if a string "starts with" the specified characters.
Add centerText() function to center text strings for a specified width.
Add case() function, a vectorized variant of the base::switch() function, which is useful for converting numeric codes into factors.
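For comparison, the same kind of code-to-label conversion can be written in base R with factor(). This sketch does not call case(), whose signature is not shown in this entry; the codes and labels are made up.

```r
# Map numeric codes to labelled factor levels using base R.
codes <- c(1, 2, 3, 2, 1)

labelled <- factor(codes,
                   levels = c(1, 2, 3),
                   labels = c("low", "medium", "high"))

print(labelled)
```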
Modify write.fwf() to capture and pass on additional arguments for write.table(). This resolves a bug reported by Jan Wijffels.
Modify xls2sep.R to avoid use of file.access() which is unreliable on Windows network shares.
When loaded, gdata (via an .onAttach() function) now checks:
If perl is not available, an appropriate warning message is displayed.
If necessary perl libraries are not available, a warning message is displayed, as is a message suggesting the user run the (new) installXLSXsupport() function to attempt to install the necessary perl libraries.
The function installXLSXsupport() has been provided to install the binary perl modules that read.xls needs to support Excel 2007+ 'XLSX' files.
No longer attempt to install perl modules Compress::Raw::Zlib and Spreadsheet::XLSX at build/compile time. This should resolve recent build issues, particularly on Windows.
All perl code can now operate (but generates warnings) when perl modules Compress::Raw::Zlib and Spreadsheet::XLSX are not installed.
Also update Greg's email address.
read.xls() now supports Excel 2007 'xlsx' files.
read.xls() now allows specification of worksheet by name
read.xls() now supports ftp URLs.
Improved ll() so user can limit output to specified classes
Fix formatting warning in frameApply().
Resolve crash of "ll(.GlobalEnv)"
Correct minor typos & issues in man pages for write.fwf(), resample() (Greg Warnes)
Correct calculation of object sizes in env() and ll() (Gregor Gorjanc)
Add support for using tab for field separator during translation from xls format in read.xls (Greg Warnes)
Enhanced function object.size that returns the size of multiple objects. There is also a handy print method that can print size of an object in "human readable" format when options(humanReadable=TRUE) or print(object.size(x), humanReadable=TRUE). (Gregor Gorjanc)
New function wideByFactor that reshapes given dataset by a given factor - it creates a "multivariate" data.frame. (Gregor Gorjanc)
New function nPairs that gives the number of variable pairs in a data.frame or a matrix. (Gregor Gorjanc)
New functions getYear, getMonth, getDay, getHour, getMin, and getSec for extracting the date/time parts from objects of a date/time class. (Gregor Gorjanc)
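The same components can be pulled out of a date/time object in base R with format(). This is a sketch for comparison; the return types of the gdata functions are not stated in this entry and are not assumed here.

```r
# Extract date/time parts from a POSIXct value using base R format strings.
when <- as.POSIXct("2014-04-04 13:45:30", tz = "UTC")

parts <- c(
  year  = as.integer(format(when, "%Y")),
  month = as.integer(format(when, "%m")),
  day   = as.integer(format(when, "%d")),
  hour  = as.integer(format(when, "%H")),
  min   = as.integer(format(when, "%M")),
  sec   = as.integer(format(when, "%S"))
)

print(parts)
```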
New function bindData that binds two data frames into a multivariate data frame in a different way than merge. (Gregor Gorjanc)
New function .runRUnitTestsGdata that enables run of all RUnit tests during the R CMD check as well as directly from within R.
Enhanced function object.size that returns the size of multiple objects. There is also a handy print method that can print size of an object in "human readable" format when options(humanReadable=TRUE) or print(x, humanReadable=TRUE).
New function bindData that binds two data frames into a multivariate data frame in a different way than merge.
New function wideByFactor that reshapes given dataset by a given factor - it creates a "multivariate" data.frame.
New functions getYear, getMonth, getDay, getHour, getMin, and getSec for extracting the date/time parts from objects of a date/time class.
New function nPairs that gives the number of variable pairs in a data.frame or a matrix.
New function trimSum that sums trimmed values.
New function cbindX that can bind objects with different number of rows.
write.fwf gains the width argument. The value for unknown can increase or decrease the width of the columns. Additional tests and documentation fixes.
Enhancements and bug fixes for read.xls() and xls2csv():
More informative log messages when verbose=TRUE
File paths containing spaces or other non-traditional characters are now properly handled
Better error messages, particularly when perl fails to generate an output .csv file.
The 'shortcut' character "~" (meaning user's home directory) is now properly handled in file paths.
XLS files created by OpenOffice are now properly handled. Thanks to Robert Burns for pointing out the patch (http://rt.cpan.org/Public/Bug/Display.html?id=7206)
Update perl libraries needed by xls2csv() and read.xls() to latest available versions on CPAN.
Add read.xls() to exported function list
Correct iris.xls example file. It didn't contain the complete & properly formatted iris data set. Fixed.
Fix typo in win32 example for read.xls()
The keep() function now includes an 'all' argument to specify how objects with names starting with '.' are handled.
keep() now shows an informative warning message when a requested object does not exist
New vignette "Mapping Levels of a Factor" describing the use of mapLevels().
New vignette "Working with Unknown Values" describing the use of isUnknown() and unknownToNA().
Several enhancements to read.xls() (thanks to Gabor Grothendieck):
New function xls2csv(), which handles converting an xls file to a csv file and returns a connection to the temporary csv file
xls2csv() and read.xls() both allow a file or a url to be specified
read.xls() has a new 'pattern' argument which, if supplied, will ignore everything prior to the first line in the csv file that matches the pattern. This is typically used if there are a variable number of comment lines prior to the header, in which case one can specify one of the column headings as the pattern. read.xls should be compatible with the old read.xls.
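The idea behind the 'pattern' argument can be sketched on plain CSV text in base R: find the first line that matches a column heading and parse from there. The sample lines are invented, and the code illustrates the technique rather than read.xls() internals.

```r
# Skip leading comment lines by locating the header row with a pattern.
csv_lines <- c(
  "Exported on 2011-01-01",
  "Source: lab system",
  "id,weight,height",
  "1,70,175",
  "2,82,180"
)

# Index of the first line that contains one of the column headings.
header_row <- grep("weight", csv_lines)[1]

# Parse only the header row and the data rows that follow it.
kept <- csv_lines[header_row:length(csv_lines)]
dat  <- read.csv(text = paste(kept, collapse = "\n"))

print(dat)
```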
Minor fixes to drop.levels(), is.what().
Implementation of unit tests for most functions.
The arguments of reorder.factor, as well as their positions, have been changed to conform with the reorder.factor method in the stats package, due to a collision bug. Argument 'make.ordered' is now 'order' and the old argument 'order' is now 'new.order'! Therefore, you have to explicitly specify new.order, e.g.
reorder(trt, new.order=c("PLACEBO", "300 MG", "600 MG", "1200 MG"))
trim() gains ... argument.
Added "unknown" methods for matrices.
Added c() method for factors based on mapLevels() functions.
Added write.fwf, which writes file in Fixed Width Format.
Added mapLevels(), which produces a map with information on levels and/or internal integer codes. Contributed by Gregor Gorjanc.
Extended drop.levels() to work on the factors contained in a data frame, as well as individual factors.
Add unknown(), which changes given unknown value to NA and vice versa. Contributed by Gregor Gorjanc.
Extended trim() to handle a variety of data types: data.frames, lists, factors, etc. Code changes contributed by Gregor Gorjanc.
Added resample() command that acts like sample() except that it always samples from the arguments provided, even if only a single argument is present. This differs from sample() which behaves differently in this case.
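The special case in sample() arises when its first argument is a single number of at least 1: it then samples from 1:x instead of from the value itself. A common base R idiom that avoids this indexes the vector by sample.int(). 'resample_sketch' below is a hypothetical name for that idiom and is not claimed to be gdata's code.

```r
# Always sample from the elements supplied, even when there is only one.
resample_sketch <- function(x, ...) x[sample.int(length(x), ...)]

set.seed(1)
only_value  <- resample_sketch(10)  # the single element 10
from_sample <- sample(10)           # a permutation of 1:10 instead

print(only_value)
print(from_sample)
```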
Updated my email address.
Fixed bug in interleave.R - option to convert 1-column matrices to vector (based on Andrew Burgess's suggestion)
Updated Greg and Jim's email addresses
ll.R: Suppressed warning message in attach() call.
frameApply.Rd, reorder.Rd: Remove explicit loading of gtools in examples, so that failure to import functions from gtools gets properly caught by running the examples.
upperTriangle.R, man/upperTriangle.Rd: Add functions for extracting and modifying the upper and lower triangular components of matrices.
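Extracting and modifying triangular components can be sketched with base R's upper.tri() and lower.tri(). This shows the underlying operation only; it does not call the new gdata functions, and the matrix is made up.

```r
# Extract and modify the triangular parts of a square matrix with base R.
m <- matrix(1:9, nrow = 3)

upper_vals <- m[upper.tri(m)]  # elements above the diagonal, column by column
lower_vals <- m[lower.tri(m)]  # elements below the diagonal, column by column

# Modify: zero out everything above the diagonal in a copy.
m_lower_only <- m
m_lower_only[upper.tri(m_lower_only)] <- 0L

print(upper_vals)
print(lower_vals)
print(m_lower_only)
```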
is.what.R: Replaced the "not.using" vector with a more robust try(get(test)) to find out whether a particular is.* function returns a logical of length one.
DESCRIPTION: Added Suggests field
Updated the example in frameApply
Added DESCRIPTION and removed DESCRIPTION.in
Updated ll.Rd documentation
Fixed bug in Args.R, is.what.R, ll.R