Each function accomplishes the work of several standard R functions. For example, two function calls, Read() and CountAll(), read the data and generate summary statistics for all variables in the data frame, plus histograms and bar charts as appropriate. Other functions provide descriptive statistics, a comprehensive regression analysis, analysis of variance and t-test, and plotting, including the Violin/Box/Scatter plot for a numerical variable introduced here, bar chart, histogram, box plot, density curves and calibrated power curve. Also provided are reading multiple data formats with the same function call, variable labels, color themes, Trellis graphics and a built-in help system. A confirmatory factor analysis of multiple indicator measurement models is available, as are pedagogical routines for data simulation such as for the Central Limit Theorem. Compatible with 'RStudio' and 'knitr', including generation of R markdown instructions for interpretative output.

NEWS

Changes for lessR version 3.5.1 (2016-10-19) <<<<<<

Plot
- Cleveland dot plot displayed with segments.y and color.grid="off" as default
- 2-D scatterplot smoothing option, smoothed, turned on by default for n>=5000
- for a line plot, input variable can be a time series
- speeded up the processing of bubble plots
- time series dates plot more generally and more cleanly

Plot
- if more than 2 x-variables, sort of y by x (sort.yx), which would not be meaningful, causes an error and so is not attempted
- if no sort requested for Cleveland dot plot, no alphabetical ordering
- a requested color gradient in a bubble plot works for integer variables as well as factors
- prop as a topic works in place of counts for bar charts and histograms
- multiple ellipses display when color.ellipse is specified

LineChart time series plots even if missing data

Changes for lessR version 3.5.0 (2016-08-29) <<<<<<

Density
- gray color theme now has a light-gray, transparent fill for the general density curve, with the normal curve still no fill
- histogram option can be set to FALSE for no background histogram

Plot
- size option can be a variable, which triggers a bubble plot with the size of each bubble scaled according to the values of the variable size
- bubble.count option renamed to the more general bubble.text
- bubble.size option renamed to the more accurate bubble.scale
- when plotting two variables against a third with the gray color theme, both lines or sets of points retain gray scale

Read when an illegal character is detected in a variable name, it is removed and the program continues
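As a rough illustration of this kind of name cleaning, the following base R sketch strips characters that are not valid in a variable name. The helper `clean_names` and its rule set are hypothetical and are not the lessR implementation.

```r
# Hypothetical helper (not lessR code): drop every character that is not a
# letter, digit, period or underscore from each variable name.
clean_names <- function(nms) {
  cleaned <- gsub("[^A-Za-z0-9._]", "", nms)
  # A syntactic R name cannot start with a digit or underscore, so prefix those
  needs_prefix <- grepl("^[0-9_]", cleaned)
  cleaned[needs_prefix] <- paste0("X", cleaned[needs_prefix])
  cleaned
}

d <- data.frame(a = 1:2, b = 3:4)
names(d) <- c("Salary$", "Years Exp")   # names as they might arrive from a file
names(d) <- clean_names(names(d))
names(d)
```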

ANOVA brief version, av.brief, works

BarChart col.bg may be set to "off" as intended

CountAll quiet applies to all output

Plot bubble plot for within column proportions available

Regression for a single predictor, confidence and prediction intervals plot properly

Changes for lessR version 3.4.8 (2016-05-01) <<<<<<

Graphics routines
- color.box option added at the global level with theme and locally, which provides for the border of the box around the plot
- setting a color option to "off" sets it to "transparent"

ScatterPlot
- native RStudio scaling smaller than regular R, now adjusted
- multiple x-variables allowed for continuous variables in addition to categorical variables
- multiple y-variables allowed, but not both multiple x and y
- fit lines (loess or ls) for multiple variables
- argument for object named regular changed to the more descriptive name of point
- option name stat changed to topic, with a default of data
- options that started with the abbreviation col now start with color
- option sort.y renamed to sort.yx to be more descriptive
- option sort.yx sorts y by x for a single x-variable, and by x2-x1 for two x-variables
- option size renamed from the R notation cex for scaling factor of plotted symbol
- option object replaces more restrictive option type
- topic arguments include proportion, median, diff for difference
- row.names can be given for the x-axis as well as the y-axis
- one or more continuous x-variables and no y-variable with a line object results in a run chart(s) with Index on the x axis
- segments.y for a Cleveland dot plot with two variables specifies to join each pair of points with a line segment
- diag option for diagonal line dropped because replaced with a Cleveland dot plot for two variables
- means are plotted for both categorical x and numeric y and vice versa
- suggestions often provided for alternate visualizations

theme suggest option added, which provides suggestions for additional input (currently implemented for ScatterPlot)

ttest
- confidence interval of standardized mean difference dropped because it relied upon the MBESS package, which added too many dependencies
- diagonal line plot replaced with two variable Cleveland dot plot

BarChart for small number of levels, bars again now scaled narrow

ScatterPlot line argument works for option object

Changes for lessR version 3.4.6 (2016-03-27) <<<<<<

BarChart, ScatterPlot, SummaryStats a numeric variable with less than n.cat unique values considered categorical only for values that are all integer
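The rule can be sketched in base R as follows. The helper name `treat_as_categorical` is hypothetical, and the strict "less than n.cat" comparison is taken from the wording of this entry; the actual lessR logic may differ in detail.

```r
# Hypothetical helper illustrating the stated rule, not the lessR source:
# a numeric variable counts as categorical only when it has fewer than
# n.cat unique values AND all of those values are integers.
treat_as_categorical <- function(x, n.cat) {
  if (!is.numeric(x)) return(TRUE)          # factors and strings are categorical
  vals <- unique(x[!is.na(x)])
  length(vals) < n.cat && all(vals == round(vals))
}

treat_as_categorical(c(1, 2, 3, 2, 1), n.cat = 10)   # few integer values
treat_as_categorical(c(1.5, 2.5, 1.5), n.cat = 10)   # few values, not integer
treat_as_categorical(1:100, n.cat = 10)              # integer, too many values
```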

BarChart
- addtop is a multiplicative factor for expanding room between the highest bar and the top of the plot, instead of additive, also a little more space added by default for 1-variable plots by setting the default to 0.05 of total height
- addtop now provides a buffer also for horizontal graphs
- prop=TRUE for two variables provides column proportions instead of for rows
- for consistency, count.levels now referred to as count.labels
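A multiplicative buffer of this kind can be imitated with base graphics. The formula below, `max * (1 + addtop)`, is an assumption about how such a factor could be applied and is not taken from the lessR source.

```r
counts <- c(A = 12, B = 30, C = 18)
addtop <- 0.05                          # the 1-variable default stated above

# Assumed form of the buffer: extend the axis by a fraction of the tallest bar
y_max <- max(counts) * (1 + addtop)

barplot(counts, ylim = c(0, y_max))
```

Because the buffer scales with the tallest bar, the same setting leaves proportionally the same head room whether the counts are in the tens or the thousands.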

Read variable names are checked for invalid characters in text files and Excel files, which R does not do

set renamed to theme

SummaryStats, BarChart, ScatterPlot for 2 variable cross-tab analysis, if there is no p-value because the cross-tab table is not well-formed, such as too many 0's, appropriately indicated in the output

ScatterPlot
- trans.fill option added to set trans.fill.pt in the function call, can still be set globally from function theme
- if y-values are unique, as in a Cleveland dot plot, default transparency level is 0 because no over-plotting, though can be set from trans.fill option
- bubble.power option provides larger bubbles for smaller frequencies and allows the user to provide a custom value
- bubble plot applies also to numeric variables
- option kind renamed style
- "off" added as a value of style, that is, do not plot the data values
- stat option added to produce a scatter plot of statistics such as the mean of a continuous variable against levels of a categorical value, or counts of a categorical variable, instead of the original data
- sort.y added to sort y-values by x-values, for Cleveland dot plot
- when y is set to row.names, y becomes the row names of the data table
- segments.y and segments.x options for line segments from axis to points

theme
- new name for set
- for default gray theme, base color for point fill and stroke is darker

BarChart stats for two variables prop (proportion) option works beside=TRUE, values properly labeled on x-axis

ScatterPlot
- a factor with more levels than unique values displays properly
- large values of frequencies display properly
- by legend displays properly with black background colors
- extreme outlier points plotted with ellipse

Changes for lessR version 3.4.4 (2016-03-04) <<<<<<

Graphics general
- long variable labels printed on graphs with full text, made multi-line and also size shrunk if needed to fit
- xlab and ylab arguments also printed multi-line, size shrunk if needed
- variable name pre-pended to the displayed variable label
- default tick labels size reduced from 0.85 to 0.75, white space at top of graph reduced if no title
- rotate.values and offset options provided to rotate axis values so as to provide more space for the label
- numerical axes value labels all displayed with same number of decimal digits

BarChart
- for two variables, prop=TRUE plots the row proportion cell frequencies, which are now also displayed in the text output
- value.labels option added to provide labels other than the existing values

LineChart individual runs not displayed by default, use show.runs to display

ScatterPlot
- can specify ellipse level including a vector of values to plot multiple ellipses on the same scatter plot
- allows bubble plot for two categorical variables in addition to the already categorical x variable, and both x and y numeric for traditional scatter plot or small number of integer values for a bubble plot
- if bubble is large enough in a bubble plot, include the frequency
- displays a 1-D bubble plot for a single factor variable
- scatter plot of one variable more narrow and centered in plot window
- introduces the Bubble Matrix Frequency Plot for Likert-type data in which multiple x-variables display a bubble plot of frequencies for the responses for multiple variables
- bubble plot from small number of unique numeric values under user control as set with n.cat, default=10 unique values of a variable
- means plot with categorical x-axis, lines of means are darker, points transparent, and the points for means are darker (or lighter)
- summary stats output of stat analysis for each type of scatter plot
- bubble plot displays corresponding counts, controlled by bubble.counts
- labels option added to provide labels other than the existing values for non-numeric variables
- fit.line can be set to TRUE without specifying a specific best-fit line, which provides a loess best-fit line
- value.labels option added to provide labels other than the existing values
- alternate names of DotPlot or dp for a 1-variable plot removed

Merge parameters from the R merge function can be passed through, such as all.x=TRUE

Read brief version of output now default, use the details function for the full version

Regression
- new name for the generated R markdown file is Rmd instead of knitr.file
- if data standardized, then so indicated on the output

BarChart
- when counts directly specified in a file, count.levels labels correct variable name
- beside=TRUE now works for 2-variable plots
- prop=TRUE now works for 1-variable plots
- two global variables now correctly produce 2-variable plot

corCFA item content properly displays

Help superfluous graphics window no longer opens for Help(lessR)

Regression
- if no predictors are significant at p<.05, analysis now proceeds to generate Rmd file
- the names of collinear variables now listed in output of Rmd file

ScatterPlot y variable now correctly re-defined according to n.cat when specified

ttestPower values correctly passed for plotting power curve

Changes for lessR version 3.4 (2015-12-27) <<<<<<

error trapping
further development of lessR error trapping to replace the more cryptic R error messages with more understandable messages that also provide guidance on how to correct the problem, building on the existing lessR explanations with the following additions:
1. specifying variables to analyze that do not exist in the data table
2. specifying variables to analyze without having a data table
3. naming the intended default data table Mydata instead of mydata
4. calling a data frame in place of a variable in ttest
5. improperly enclosing a variable name in quotes in a function call
6. failing to specify a variable to analyze in ttest and ScatterPlot
7. trying a histogram for a categorical variable
8. trying a scatter plot with the second variable non-numeric

BarChart
- colors changed for two variable plots, now based on hues generated by rainbow_hcl(24,c=38,l=75) from the colorspace package, such that when desaturated all colors have the same shade of gray
- proportions option now available for 2 variable plots
- phi coefficient or Cramer's V displayed with two variable analyses
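The idea behind such a palette, varying only the hue while holding chroma and luminance fixed, can be approximated without the colorspace package by using the base R function hcl(). This is a sketch of the principle under that assumption, not the exact colors that lessR produces.

```r
n <- 24
# n evenly spaced hue angles around the color wheel (0, 15, ..., 345 degrees)
hues <- seq(0, 360, length.out = n + 1)[-(n + 1)]

# Constant chroma (c) and luminance (l): only the hue changes between colors,
# which is why the colors map to nearly the same gray when desaturated
cols <- hcl(h = hues, c = 38, l = 75)
head(cols)
```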

PieChart
- frequency distribution added so that text output is same as BarChart
- colors same as BarChart for two variables

SummaryStats
- chi-square test provided
- phi coefficient or Cramer's V displayed with two variable analyses

Help spacing improved with shorter lines of output

Logistic collinearity analysis restored

Regression
- subsets option can be an integer to specify maximum number of lines displayed, where each line represents a specific subset model
- for subsets of more than 40 lines, the variable names are written each 30 lines
- scatter plot matrix adjusts the size of the correlation coefficients depending on the number of predictors
- better labeling of subsets output to indicate that only the best 10 models of each number of predictors are considered, when relevant

Read, details
- more concise output
- Hadley Wickham's read_excel function restored for reading Excel files, and Read re-interprets the variable types from read_excel so that they are equivalent to those from reading other file formats with Read

ttest for paired analysis, difference score now computed from subtracting first variable from second variable
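The direction of the difference matters for the sign of the result. A base R sketch with made-up data, in which the difference is the second variable minus the first:

```r
# Illustrative data only
pre  <- c(10, 12, 9, 14, 11)   # first variable
post <- c(13, 15, 9, 18, 12)   # second variable

d <- post - pre                # second minus first, so a gain is positive
mean(d)

# A one-sample test on the differences matches the paired test
# when the second variable is listed first in t.test()
t_diff   <- t.test(d, mu = 0)
t_paired <- t.test(post, pre, paired = TRUE)
```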

ScatterPlot
- ellipse option restored, with axes automatically reset to provide for values of the ellipse that exceed the range of values of the data, and fill.ellipse color can be specified, usually with partial transparency such as rgb(.8,.8,.8,.2)
- correlation analysis restored
- if xy.ticks is FALSE, then axis labels moved closer to the plot
- ellipse option applies to bubble/sunflower plot

VariableLabels new function that essentially replaces label function, with new features of reading a file of variable names and labels separately from the Read function, and also from the console

BarChart legend printed in light text if background is dark

Histogram, SummaryStats outlier analysis for small outliers improved

Merge variable units now properly processed

ScatterPlot empty graphics window no longer generated

SummaryStats when number of unique values <= n.cat, properly treat variable as categorical

ttest missing data allowed for paired version

Changes for lessR version 3.3.6 (2015-11-05) <<<<<<

ANOVA
- output now constructed in segments for better knitr compatibility,

    > a <- ANOVA( ... )
    > a # view all the output
    > names(a) # view the names of the segments
    > a$out_anova # for example, view the summary table

- knitr.file option for automatic construction of markup file from the various output segments
- improved formatting of summary table
- graphics=FALSE option added

Read because of potential package dependency problems when loading the readxl package for reading Excel files, went back to the gdata package, which requires Perl, which requires a download for Windows computers, and which, unfortunately, only reads the formatted data, not the actual data, so first format the Excel data according to the General format before reading

ScatterPlot because of package dependency problems when loading the car package, the ellipse option from that package is deactivated

Regression because of package dependency problems when loading the car package, the scatter.3D option from that package is deactivated

Regression background, which listed variables in the model, sample size, etc., displays the intended information

Changes for lessR version 3.3.4 (2015-08-22) <<<<<<

Read, details better display format of variable labels and units

Regression
- knitr.file option has added display options code for displaying the code that generated the results and document for documenting the code
- knitr.file option extended to work with mydata <- rd(), that is browse for the file to read before doing the regression
- improved use of variable units in R Markdown from knitr.file option

Set can set the values for display options globally in the generated knitr.file, which includes results, explanation, interpretation and document

ANOVA missing terms from the sums of squares table included

Histogram, Density and BoxPlot variable names that are also function names are properly processed

Merge, Subset variable units preserved

Read specified format of data file remains regardless of file type

Changes for lessR version 3.3.3 (2015-07-23) <<<<<<

Read Hadley Wickham's read_excel function used for reading Excel files, which does not require Perl, with the character variables from read_excel set as factors as with reading other data formats, except for the following addition for all formats ... Non-numeric character strings with unique values read as class character instead of class factor
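The character-versus-factor rule can be sketched in base R. The helper `as_read_class` is hypothetical and only illustrates the stated rule; it is not the code used by Read.

```r
# Hypothetical helper: keep a character column as character when every
# non-missing value is unique (such as names or IDs), otherwise make it a factor.
as_read_class <- function(x) {
  vals <- x[!is.na(x)]
  if (anyDuplicated(vals) == 0) x else factor(x)
}

ids    <- c("A101", "B202", "C303")   # all unique, so stays character
gender <- c("F", "M", "F")            # repeated values, so becomes a factor

class(as_read_class(ids))
class(as_read_class(gender))
```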

Regression
- extensive further development of the generated markup file from the knitr.file option
- up to 6 predictor variables allowed for specifying new data, instead of just 5

Logit up to 6 predictor variables allowed for specifying new data, instead of just 5

corCFA
- knitr.file option
- option for lavaan style model specification, the same code runs both corCFA and lavaan
- min.cor and min.res options added for minimum respective value to be printed, to improve readability of correlation matrices
- output correlations omit the decimal point for more compact output
- correlations predicted from the model available in assigned output of function
- factor labels displayed on output correlation matrices

Label, Details print formatting of labels improved

Read SAS files are read

Write specify parameters in a more standard order: ref, data, format

corRead abbreviated form rd.cor properly recognized

Changes for lessR version 3.3.1 (2015-04-27) <<<<<<

BoxPlot, Density, Histogram output generated with named pieces such as for knitr, plus knitr.file option

Regression
- knitr.file much further developed including reproducing the full function call to the Regression function where the knitr.file is created, and now includes flags for output control: explain, interpret, results
- knitr information is now only written to a file, not to the output object
- displayed prediction intervals always contain the smallest interval and the largest interval
- PRESS R-squared included in the default output
- minimum number of decimal digits on output changed from 3 to 2, e.g., integer input leads to 2 decimal digits by default (override with digits.d)
- spacing of tabled output condensed
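For readers unfamiliar with PRESS R-squared, the following base R sketch computes it from the usual definition, using leave-one-out residuals obtained from the hat values. It uses the built-in mtcars data and is not drawn from the lessR source.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Leave-one-out (PRESS) residual for each row: e_i / (1 - h_ii)
press_res <- residuals(fit) / (1 - hatvalues(fit))
PRESS <- sum(press_res^2)

# Total sum of squares of the response
SST <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

press_rsq <- 1 - PRESS / SST
c(R2 = summary(fit)$r.squared, PRESS_R2 = press_rsq)
```

Because each PRESS residual is at least as large in magnitude as the ordinary residual, PRESS R-squared never exceeds the ordinary R-squared, which makes it a more conservative indicator of predictive fit.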

Histogram improved formatting of displayed frequency distribution

BoxPlot default for add.points option is overstrike instead of stack

corEFA rotate="none" option replaces show.initial option, now deleted

Nest the specification of the full model, the 3rd argument, can be all the variables in the full model or now just the added variables to the reduced model to define the full model

output control flags, in this order of presentation: explain, results, interpret. On by default, but each can be set within each procedure that generates a knitr file as well as with a global option, such as options(explain=FALSE)

library(lessR) added to knitr files from knitr.file option

SummaryStats outliers properly identified if smaller in value than 3

Correlation heat map for correlation matrix works more generally

Regression
- works with no predictor variables, e.g., reg(Y ~ 1)
- printed tables correctly display factor variables

SummaryStats if a by variable, now no output for stats in assigned object, instead of just for the last row

Changes for lessR version 3.3 (2015-03-19) <<<<<<

knitr compatible

Regression, Histogram, SummaryStats output system redesigned so that now all output is formally returned when the corresponding function completes, back to the standard R way of doing things in pieces, but here each piece is enhanced with additional features

new function regPlot, which produces the Regression plots from the saved output of a previous regression run so that the plots can be interspersed throughout a knitr document

new function print.outall which, to add to knitr functionality, allows each of the pieces produced by Regression to be displayed individually, and is called implicitly by simply entering the name of the object

new function print.outpiece which, to add to knitr functionality, allows each of the paragraphs of output produced by Regression to be displayed individually, including in knitr, simply by entering the name of the saved piece, such as r$out_estimates if this was run: r <- Regression(Y ~ X1 + X2), and is called implicitly by simply entering the name of the object

Regression
- knitr.file option added to automatically generate knitr instructions which, when processed, result in an enhanced html, pdf or Word document that can be called interpretative output, statistical output plus commentary
- graphics=FALSE option added, mostly for use with the new regPlot
- explain=TRUE now generates the explanation in the knitr instruction file instead of the console, and is now the default so reg.explain was removed
- all new components for the saved object, now of class out_all from the analysis instead of the object of class lm defined by the R lm function, though many components are shared, also includes the knitr instructions
- scatterplot matrix with correlations in the upper triangle now default

Screen size of subsequent plots not changed after Help()

SummaryStats for a by analysis, levels with n=0 do not prevent analysis of all levels

Correlation for a selected subset of variables, the heat matrix is plotted if requested

Changes for lessR version 3.2 (2015-02-24) <<<<<<

if more than one plot is created from a function call the name of each plot is displayed at the end of the console output

RStudio compatible, when in RStudio now graphics are managed as a sequential stream to the plot window

ttest
- line chart for a confidence interval of a mean displays (if requested) even if no hypothesized value
- for two groups, line.chart option works reliably for both groups

PieChart color gradient for ordered colors from an ordered factor extended to all color themes

col.ticks parameter no longer defined by lessR functions but passed directly as an R parameter, which avoids the warning messages

warning messages the causes of many warning messages, though benign, were identified and removed by reprogramming

Changes for lessR version 3.1.1 (2014-09-22) <<<<<<

corReorder provides a new cor matrix if specified

BarChart for gray scale, bars a little lighter shade of gray

corEFA lavaan code from the EFA solution revised

label no argument, label(), displays all variable names and labels

PieChart passing standard R graphics parameters produces a square chart, so to avoid this issue the magnification factors cex and cex.main, for the labels and title, are explicitly defined

ScatterPlot default 1-dimensional scatter plot is method="overplot"

ttest
- graph for one group extends to large deviant values of mu0 from the data
- brief version includes the margin of error
- includes needed sample size for desired margin of error for 1 and 2 groups
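A rough idea of how a needed sample size follows from a desired margin of error for one group is given below. The sketch uses the large-sample normal approximation, which is an assumption made here for simplicity; ttest may use a different, t-based calculation, and `needed_n` is a hypothetical helper.

```r
# Hypothetical helper: smallest n for which z * s / sqrt(n) <= moe,
# where s is the assumed standard deviation and moe the desired margin of error.
needed_n <- function(s, moe, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)   # two-sided critical value
  ceiling((z * s / moe)^2)
}

needed_n(s = 10, moe = 2)   # 97
```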

Write for a csv write, create a second file of any variable labels

label if the specified variable does not exist, an error message is displayed

Changes for lessR version 3.1 (2014-02-24) <<<<<<

default color theme changed from "blue" to "dodgerblue", which now has 0.25 default transparency for bar fill, if the previous "blue" is desired, then set with: set(colors="blue")

citations use of functions from other contributed packages cited in output

BarChart, BoxPlot, Histogram, ... can specify an entire data frame for analysis with the data parameter in addition to the variable parameter (x, usually listed first), e.g., hs(attitude) or hs(data=attitude)

Density, LineChart analysis of a data frame or list of multiple variables possible

BarChart
- invisibly returns the frequencies and proportions just as SummaryStats, e.g., stats <- BarChart(Y), so stats contains this info
- pre-set transparency level of col.fill.bar applies to bar chart bars of a single variable

SummaryStats (and functions that call SummaryStats)
- outliers listed in two groups, those above the high box plot whisker, and those below the low box plot whisker, and if more than 25 then the intermediate values in a group are not listed
- more appropriate output when there is a frequency of zero on a by variable
- an explanatory note provided when computing row or column proportions that result in divide by 0, which displays as NaN for "not a number"
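Splitting outliers into a low group and a high group relative to the box plot whiskers can be sketched with the base R function boxplot.stats(), which applies the usual 1.5 x IQR rule. Whether lessR uses exactly this rule is an assumption, and the data are made up.

```r
x <- c(2, 50, 51, 52, 53, 54, 55, 56, 57, 120, 130)

bs <- boxplot.stats(x)
low_whisker  <- bs$stats[1]   # smallest value that is not an outlier
high_whisker <- bs$stats[5]   # largest value that is not an outlier

below <- sort(x[x < low_whisker])    # outliers beneath the low whisker
above <- sort(x[x > high_whisker])   # outliers above the high whisker

below
above
```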

label can assign a variable label to a variable as well as list the label, so labels can be created/modified without reading from an external file

corCFA
- lavaan code for the default maximum likelihood solution with the lavaan function cfa generated for the specified measurement model
- content of items by scale listed in the sorted order by loading
- parameter added, labels="only" only lists the variable labels with no analysis for a content analysis only
- model solution invisibly returned that includes the estimated parameters and the scale reliabilities plus residuals
- improved formatting of column displays

corEFA lavaan code generated for measurement model suggested by the EFA solution

simCLT triangle package needed for antinormal distribution has been updated to R 3.0, so antinormal distribution restored (antinormal distribution has no values in the middle and most values at the extremes)

Merge, Recode, Subset, Transform variable labels that exist in the input data frame(s) are retained in the transformed data frame

BarChart, SummaryStats for more than 10 categories the proportions are correctly computed

Correlation, ScatterPlot correct variable labels listed

ScatterPlot method parameter used for purpose other than specifying spearman or kendall for correlation type, of which use is now flagged

corScree specified correlation matrix analyzed instead of just one named mycor

corEFA 1 factor solution completes

Changes for lessR version 3.0 (2014-01-02) <<<<<<

Subset
new parameter: random. Specifies the number or proportion of data rows to retain, which replaces the dual use of the rows parameter to both perform this task and provide a direct specification of the rows of the data table to be included/excluded, so now the following work:

    mydata <- Subset(c(1,4)) # retain only rows 1 and 4
    mydata <- Subset(-c(1,4)) # delete only rows 1 and 4
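The two tasks that were separated here, direct row selection and random row retention, correspond to the following base R operations. This is a sketch with an illustrative data frame, not the Subset implementation.

```r
set.seed(1)
d <- data.frame(id = 1:10, x = rnorm(10))

kept    <- d[c(1, 4), ]     # retain only rows 1 and 4
dropped <- d[-c(1, 4), ]    # delete only rows 1 and 4

# Random retention: keep a proportion (here half) of the rows
prop <- 0.5
keep_rows <- sample(nrow(d), size = round(prop * nrow(d)))
d_random  <- d[sort(keep_rows), ]
```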

ttest
- new parameter: line.chart. When set to TRUE, adds a line chart of the response variable for each group in the analysis
- aesthetics of the density curve output updated

ANOVA condition that led to a warning for the means plot fixed

Correlation in the output, the correlation matrix object was always described as mycor regardless of the actual assigned name, this line of output is now deleted

Density missing data with specified bins now works

ttest density plot in gray scale if colors="gray.black"

Changes for lessR version 2.9.7 (2013-10-29) <<<<<<

details add a brief version, details.brief, which only lists the table of variable names and any variable labels

ScatterPlot for 1-D scatter plot, to conform to standard R, rename option plot.method to method

Read
- relying upon the read.xls function from the gdata package, can read Excel files identified by the .xls or .xlsx filetype, both the data file and/or the labels file can be Excel files
- add a brief version, rd.brief, which calls the new details.brief
- provide an option to browse for the labels file, labels=""
- always display the full path of the data file and any label file

BarChart if a data frame analyzed, then a categorical variable with only a single value would cause a fatal error, now the remaining variables are analyzed and a diagnostic displayed instead

Read labels files for Windows now properly specified

ScatterPlot for a plot with a categorical x-axis, additional parameters such as ylim now work correctly

corScree graph of "successive differences of eigenvalues" now properly labeled

Changes for lessR version 2.9.4 (2013-08-25) <<<<<<

BarChart return the table of frequencies, so can assign to an object

BoxPlot subset of variables can be specified, e.g., bx(c(x,y,z))

CountAll parameters may now be added, such as bin.start for Histograms

Density test for normality done only if a normal curve is plotted

Histogram
- bin.end parameter added
- subset of variables can be specified, e.g., hs(c(x,y,z))

ttest
- to accommodate density plot of more data sets, bandwidth default changed from nrd to bcv
- two vector form of two-group t-test now accommodated from a data frame to permit a dependent-groups analysis from a data frame
- for a dependent-groups analysis, or paired t-test, a scatter plot of the two variables is produced with a diagonal line through the plot to indicate equality and the vertical distance from the line to each point displayed to indicate the extent of the change

ttestPower value of n on graph displayed as an integer or with decimal digits as appropriate

ScatterPlot
- for a scatter plot of two numeric variables, diag=TRUE places a diagonal line through the plot with vertical lines from each point to the diagonal, primarily for plotting change in a dependent samples t-test
- removed x.start, x.end, y.start and y.end for bubble plots: use xlim, ylim

SummaryStats
- returns summary statistics for analysis of a single variable
- subset of variables can be specified, e.g., ss(c(x,y,z))
- analysis of a data frame yields the default value of brief, which can be overridden in the function call

BarChart graceful termination if a bar chart is attempted with only 1 unique value

Density col.fill.nrm, normal curve fill, can set to transparent for blue color theme

Regression density lines in residuals density plot now appropriate color for black backgrounds

ScatterPlot
- xlim and ylim also apply to bubble plots, before they were ignored
- line plot by default even when intervals of successive values of a sorted x are only equal to within 9 decimal digits
- 1-D plot displays outliers with same plot.method as regular points

Correlation name of first variable in bivariate correlation now displays correctly

Changes for lessR version 2.9.3 (2013-05-26) <<<<<<

Note: The Excel read functionality added in 2.9.2 is removed because it required Java, and this additional installation was adding too much complexity for users. To retain this functionality, do the following.

```
install.packages("xlsx")   # one time only
library(xlsx)              # for each R session to invoke the following
mydata <- read.xlsx(file.choose(), sheetIndex=1)
```

This provides for a direct read of an Excel file by browsing for the file. To specify a specific path name or URL, replace file.choose() with the correct name in quotes.

The only lost functionality if the above code is implemented is that variable labels cannot be read with an Excel file. To provide for these labels first save the Excel file as a csv file.

LineChart a "zero" option is provided for center.line to pass the line through 0

simCLT the "antinormal" option is inactivated until the supporting triangle package is updated

BarChart does not terminate when a table is specified as input

Changes for lessR version 2.9.2 (2013-05-08) <<<<<<

Read
- tab-delimited text data file detected by default in addition to csv text data file
- Excel files now read and detected by default, including variable labels

Density for colors with a black background, density functions plotted with light colors

corRead abbreviation rad.cor no longer available, use rd.cor

Histogram rounding error in the computation of cumulative probabilities fixed

Changes for lessR version 2.9 (2013-03-11) <<<<<<

Nest compare a nested model to a full model with least-squares or logit fit
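The underlying comparison of a nested model with a full model can be sketched with the base R function anova(), shown here for both a least-squares and a logit fit on the built-in mtcars data. This illustrates the statistical idea only and does not reproduce the Nest function or its output.

```r
# Least-squares: does adding hp and qsec improve on wt alone?
reduced <- lm(mpg ~ wt, data = mtcars)
full    <- lm(mpg ~ wt + hp + qsec, data = mtcars)
cmp_ls  <- anova(reduced, full)          # F-test on the 2 added predictors
cmp_ls

# Logit: the same idea with a chi-square test on the change in deviance
reduced_l <- glm(am ~ wt, data = mtcars, family = binomial)
full_l    <- glm(am ~ wt + hp, data = mtcars, family = binomial)
cmp_logit <- anova(reduced_l, full_l, test = "Chisq")
cmp_logit
```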

details obtain the details of a data frame, such as called from Read

a variable to be analyzed from the user's workspace is so noted

Read2 renamed from rd2

Regression can return an object of class lm

Logit
- classification table added
- if only some forecasts shown, the middle range is for fitted values close to the threshold of 0.5
- collinearity analysis added for multiple predictor variables
- can return an object of class glm

ANOVA
- ANOVA tables now cleanly formatted
- residuals displayed as in Regression, by default first 20 sorted
- res.rows and res.sort options added, as in Regression

Histogram can return an object of class histogram

Density can return an object of class density

BoxPlot
- allow R graphics parameters to be passed, such as whiskcol, see ?bxp
- colors adjusted for gray and gray.black
- can return an object with standard boxplot components

SummaryStats
- if integers in input data then output to 2 decimal digits
- if more than 50 outliers, then just first and last 25 are displayed

set color white added quiet option now can be set, e.g., set(quiet=TRUE) brief option now can be set, e.g., set(brief=TRUE)

ANOVA brief form works correctly

Logit plot of fitted values and scatter plot produced consistently

BoxPlot numerical values on the correct axis for vertical and horiz orientation

Histogram situation in which largest value exceeded the largest bin fixed

set transparency properly initialized for default blue

Changes for lessR version 2.8 (2013-02-01) <<<<<<

The keepers of CRAN have changed the rules. They no longer allow a function to automatically direct output to a data table. They have decided that users should always explicitly specify the destination of the output file.

What that means for lessR is that any function that outputs a data table can no longer automatically write that data table to mydata or another chosen name. Instead you must now explicitly assign the output data table name when reading or modifying data, usually mydata or mycor. To do this, use the R assignment notation, <- , which assigns anything on the right side to whatever is on the left side of the expression.

> mydata <- Read()
> mydata <- Transform(Y=X/12)     also Subset, Merge, Recode, Sort
> mycor <- Correlation()

If you do not make this explicit assignment, the function still works, but the output is dumped at the console instead of sent to a stored data frame such as mydata to be available for later analysis.
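The effect of explicit assignment can be sketched with base R alone, so that the example runs without lessR installed. The data frame d and its column X are made up for illustration, and the base R transform function stands in for the lessR function.

```r
# Hypothetical data: X holds a duration in months
d <- data.frame(X = c(12, 24, 36))

# Without assignment, the modified data frame is only printed at the console
transform(d, Y = X / 12)

# With assignment, the result is stored and available for later analysis
d <- transform(d, Y = X / 12)
d$Y
```

After the second call d contains the new variable Y, whereas the first call leaves d unchanged.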

Other generic changes:

To be consistent with R functions, the data frame option has been changed from dframe to data. Usually this is not used as the default mydata is relied upon, but now specify other names with the data option.

Previously the brief option was used inconsistently. For some functions it led to a brief output, and for others it suppressed output. It remains for functions that primarily send output to the console. For graphics functions and data modification functions, now completely suppress output with quiet=TRUE.

Merge merges two data frames either horizontally or vertically

default system setting n.cat, the maximum number of unique values of a variable to be treated as a categorical variable by default, changed to 0, turned off by default

Recode, Sort, Subset, Transform now precede the function call with, for example: mydata <-

Read
now precede the function call with, for example: mydata <-
variable labels now incorporated directly into the data frame and are now read with the labels option
rd is the abbreviation, though the older rad is still available
lessR.data option re-specified as format="lessR"
quiet option replaces brief

Subset
holdout sample can no longer be created from within the function given the rule change from CRAN, but holdout=TRUE creates the code to copy and paste back into R to create the holdout sample

ttest
for two group analysis from a formula, the separate data vectors are returned for later analysis (see the examples)
graph for two group analysis now in gray scale for colors="gray"
when input is summary stats, reported summary stats are to the same level of precision as what was input
variable label, if present, appears on density graph
standard R alternative option available for one-tailed tests
paired=TRUE option available for dependent-groups t-test

ttestPower powercurve.t.test name removed in favor of ttestPower

ANOVA
randomized blocks analysis displays the marginal and grand means
two-way between groups analysis displays the cell size once instead of the same number for all cells

Regression
standardization option available
rgl package bug apparently fixed, so scatter.3d=TRUE is again available for models with two predictor variables
singularity check added and solution terminated if so
residuals vs fitted values plot plotted with current color theme
scatterplot of prediction intervals with current color theme

Correlation
can use method="kendall" and method="spearman"
graphics=TRUE to create a scatter plot matrix and heat map
pdf=TRUE to create and write scatter plot matrix and heat map to pdf files
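The method names are the same ones accepted by the base R cor function. The following sketch uses only base R and the built-in mtcars data to compare the three coefficients. It is an illustration, not the lessR implementation.

```r
# Weight and fuel efficiency in the built-in mtcars data are negatively related
r_pearson  <- cor(mtcars$mpg, mtcars$wt)                       # default, linear
r_spearman <- cor(mtcars$mpg, mtcars$wt, method = "spearman")  # based on ranks
r_kendall  <- cor(mtcars$mpg, mtcars$wt, method = "kendall")   # based on concordant pairs

round(c(pearson = r_pearson, spearman = r_spearman, kendall = r_kendall), 2)
```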

Density
summary statistics reported by default
quiet=FALSE replaces text.out=TRUE

Histogram quiet=FALSE replaces text.out=TRUE

ScatterPlot
one variable, method="jitter" option from R stripchart possible
missing data removed to enable ellipse from car package
the by variable need no longer be a factor
fit.line applies to each level of a by variable
for a by variable, width of plot adjusted for legend when saving to a pdf
quiet=FALSE replaces text.out=TRUE

BoxPlot
color of the box more vivid
quiet=FALSE replaces text.out=TRUE

Logistic abbreviation is lr instead of older lgt

set
default colors="blue" bar fill is lightsteelblue3 from lightsteelblue
colors="sienna" and "gray.black" color themes added
colors="dodgerblue" given a light gray background

BarChart
reports the corresponding chi-square test
count.names option name changed to the more meaningful count.levels
quiet=FALSE replaces text.out=TRUE

PieChart
for consistency with other functions, col.pieces changed to col.fill, the specified color of the regions of the pie chart
quiet=FALSE replaces text.out=TRUE

SummaryStats only report summary statistics (chi-square test moved to BarChart)

ANOVA pdf=TRUE properly writes the graphs to the working directory

ttest
graph for two group analysis shows the degrees of freedom in the title
if missing a grouping variable data value, analysis still proceeds

BarChart ordered progressions of color with purple, sienna and dodgerblue work

Histogram
col.ticks warning addressed and no longer generated
text.out can now be set to FALSE

Density a perfectly symmetrical distribution properly plots as a density function

LineChart default area under the plotted line segments now fills to proper color according to the current color theme

ScatterPlot
transparency for one and two variable plots correctly provided by default
ellipse works correctly for non-regular plots such as bubble plots
for kind option, bubble and sunflower can now be specified as documented
sunflower plot has background and grid colors according to color theme
show.n=TRUE works correctly for pairwise deletion for correlation matrix

Correlation properly accept variables in global environment

Regression
prints residuals and forecasting errors when there is a factor predictor
for categorical variables the results for all levels are displayed

set n.cat no longer set to 4 when set function called

Write if suffix .csv or .rda already exists, not added again to file name

Changes for lessR version 2.6 (2012-10-24) <<<<<<

Graphics procedures
Color themes were enhanced and the terminology for modifying individual colors in a specific graph or system-wide with the set function was standardized. 'fill' refers to the color of an interior region, either of a bar or a circle. 'stroke' refers to a line or outline, such as the border of a histogram bar or a plotted point. Also, to change a color theme is now only available with function set, as are references to transparent colors with trans.fill.bar and trans.fill.pt.

set
added an orange color theme, which has a black background instead of the usual light background, also added dodgerblue and purple
revised green color theme
added option ghost to provide transparent bars against a black background with no grid lines, which works well with colors such as orange and red
colors option was getting too complex and was cluttering the options lists for the graphics functions, so color theme and transparency only available from the set function, but with more extensive options

Read
default text output to console that describes the data is redesigned
new parameter lessRdata allows direct reading of built-in data sets
can read directly from built-in data sets with lessR.data= option

Write
write any specified data frame, not just the default mydata
specify any file name or rely upon the default
by default write row IDs as part of the written csv file
the dframe option moved to the end of the parameter list to be consistent with Read

Subset
abbreviation locate added to emphasize locating cases without creating a new data frame, where save.dframe is automatically set to FALSE
can subset on row.names
criterion for selecting rows, rows, can be an integer or proportion, to indicate the number of rows to randomly extract and also to create a hold out sample
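How an integer or a proportion can select random rows and leave a hold-out sample can be sketched in base R. The helper pick_rows and the data frame d are hypothetical names for this illustration only. The rule that values below 1 are proportions is an assumption, not taken from the lessR source.

```r
set.seed(1)                      # reproducible random draw
d <- data.frame(id = 1:10, x = rnorm(10))

# Hypothetical helper: a value >= 1 is a row count, a value < 1 is a proportion
pick_rows <- function(data, n_or_prop) {
  n <- if (n_or_prop < 1) round(n_or_prop * nrow(data)) else n_or_prop
  sample(nrow(data), n)          # row indices drawn without replacement
}

idx      <- pick_rows(d, 0.7)    # 70% of the rows
training <- d[idx, ]
holdout  <- d[-idx, ]            # the remaining rows form the hold-out sample
```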

Recode
just data for the variables to be recoded are shown before the recode, and just the recoded and any new variables shown after the recode
a recode is not allowed to be applied to a factor because doing so converts the factor to character strings (use factor function instead)

Transform transformed data is shown only for variables that have been transformed

Sort
default saves the sorted data frame written over the input data frame without needing to explicitly assign the result to a data frame
the keep=FALSE option allows the sorted data frame to be written to another data frame with the R assignment statement
random option added to randomly shuffle the rows of data

Help
the argument for a specific help page no longer needs to be enclosed in quotes
capitalization of the argument for a specific help page is irrelevant

ANOVA
randomized block design supported (in addition to one and two factor between groups designs)
fitted plot and data plotted for randomized block design
residuals provided
effect sizes provided
graphs may be saved to pdf files with pdf=TRUE
one-way cell mean plot works with current color theme
HSD analysis for two-way models, between groups and randomized blocks
marginal means provided for two-way models

Regression
scatterplot matrix incorporates the color theme
display of prediction intervals includes interval width
decimal digits uniformly applied across the text output

Logit scatterplot matrix added when there are multiple predictor variables

ttest
option for saving the graphic of the two density curves to a pdf, consistent with other lessR functions for graphics
add a show.title option to suppress the title over the graph
Cohen's d effect size index added to one-group t-test
density plot with Cohen's d, mean and hypothesized mean
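For reference, the usual definition of Cohen's d for one group is the distance of the sample mean from the hypothesized mean in standard deviation units. The sketch below uses base R with made-up data and does not reproduce the ttest output.

```r
# Hypothetical sample and hypothesized mean
y   <- c(52, 48, 55, 60, 47, 53, 58, 51)
mu0 <- 50

# Cohen's d for one group: mean difference in standard deviation units
d <- (mean(y) - mu0) / sd(y)

# The t-statistic is d rescaled by the square root of the sample size
t_from_d <- d * sqrt(length(y))
all.equal(t_from_d, unname(t.test(y, mu = mu0)$statistic))
```

Unlike the t-statistic, d does not grow with the sample size, which is why it serves as an effect size.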

ScatterPlot for one variable, dot plot, gray scale outliers displayed in squares and diamonds, for potential and actual outliers, respectively

Histogram trans.bars option available, analogous to trans.pts for scatter plots

LineChart option col.border added to specify the border color of the filled polygon under the plotted lines, including the value of "transparent"

BoxPlot dotplot option changed to add.points to be consistent with the call to the ScatterPlot function for one variable, i.e., a 1-D scatter plot

corCFA
sum of squares and average residual for each item and total available
the number of default iterations for communality estimates increased from 15 to 25
an abbreviation called scales added to retain 1's in the diagonal for component analysis, that is, the observed scale scores

corEFA
items by default sorted by their highest factor loading, with an option provided to not do this
min.load option changed to min.loading and this applies to the output of the EFA as well as the constructed model for the CFA
n.fact argument changed to n.factors

data files the name of each included data file begins with "data" instead of "dat"

prob.tcut renamed from qnt.t, t-cutoff probability function with t and normal curves

Subset, Transform, Sort when dframe not saved because save.dframe set to FALSE, dframe properly is assigned to a new data frame via an assignment statement

PieChart colors option works correctly

BarChart
changes to background color and grid with colors="gray" work correctly
frequency table displayed when prop=TRUE

LineChart color theme now applied to fill color under plotted polygon

ttest standard deviation on graph for second group reported correctly

Changes for lessR version 2.5 (2012-08-09) <<<<<<

Sort sorts the rows of a data frame by the values of specified variables for both numeric variables as well as factors
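The underlying idea, ordering rows by one or more variables, can be sketched with the base R order function. The data frame and its variables are invented for this example.

```r
d <- data.frame(Gender = factor(c("M", "F", "M", "F")),
                Salary = c(50, 70, 40, 60))

# Sort rows by a factor first, then by a numeric variable within each level
sorted <- d[order(d$Gender, d$Salary), ]

# Descending order on the numeric variable
sorted_desc <- d[order(d$Salary, decreasing = TRUE), ]
```

A factor is ordered by its levels, here F before M, not by the order in which the values appear in the data.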

Transform a modified version of the standard R transform function, but by default saves the revised data frame to the input data frame and provides feedback and information regarding the transformation(s)

Subset a modified version of the standard R subset function, but by default saves the revised data frame to the input data frame and provides feedback and information regarding the changes to the data frame

Logit logit analysis, a wrapper for the standard R glm function with family=binomial plus related functions such as summary and predict
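The base R calls being wrapped look roughly as follows. This sketch uses the built-in mtcars data and adds a classification table at a 0.5 threshold as an illustrative assumption. It shows the standard R functions only, not the Logit output.

```r
# am (0 = automatic, 1 = manual) predicted from weight in the built-in mtcars data
fit <- glm(am ~ wt, data = mtcars, family = binomial)
summary(fit)                                   # coefficients and fit statistics

# Fitted probabilities, then classification at the 0.5 threshold
p_hat     <- predict(fit, type = "response")
predicted <- as.integer(p_hat >= 0.5)
table(actual = mtcars$am, predicted = predicted)
```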

corCFA, corReflect, corReorder variables are now specified by their names instead of by their ordinal position in the correlation matrix

corEFA to match the change in specifying variables in corCFA, the derived confirmatory model is now written in terms of variable names

corScree on the graph of the differences of successive eigenvalues, a horizontal line is drawn to better highlight the "scree"

Correlation
can now provide a list of variables from the input data frame instead of having to first separately create the subset data frame
non-numeric variables are now automatically deleted from a submitted data frame or variable list, with the analysis proceeding

Recode
a list of variables instead of just one variable may now be recoded
missing data entries may now be recoded to valid values
specified valid values may now be recoded to missing values

ScatterPlot original function Plot, abbreviated plt, was based on R function plot, which did a scatter plot of two variables and also did a line chart for one variable; now only 1 and 2 dimensional scatter plots are done, so the function is renamed accordingly, where a 1-D scatter plot is a dot plot, though Plot is still available as a name
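The three kinds of recoding can be illustrated with base R. The vector x and the reverse-scoring scheme are made up for this sketch, which does not use the Recode function itself.

```r
x <- c(1, 2, 3, NA, 5)            # hypothetical responses to a 5-point item

# Recode individual values with a lookup: reverse-score the item
old   <- c(1, 2, 3, 4, 5)
new   <- c(5, 4, 3, 2, 1)
x_rev <- new[match(x, old)]       # a missing value stays missing

# Recode missing data entries to a valid value
x_filled <- ifelse(is.na(x), 0, x)

# Recode a specified valid value to missing
x_na <- replace(x, which(x == 5), NA)
```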

Help each help page can now be invoked with a variety of key words, which usually include the full and abbreviated names of each function described on that help page

SummaryStats now recognizes n.cat to treat numeric variables as categorical if the number of unique values is less than or equal to n.cat
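The rule can be sketched in a few lines of base R. The helper is_categorical is a hypothetical name used only here, and excluding missing values before counting unique values is an assumption.

```r
# Hypothetical helper mirroring the rule: numeric with at most n.cat unique values
is_categorical <- function(x, n.cat) {
  is.numeric(x) && length(unique(na.omit(x))) <= n.cat
}

likert <- c(1, 3, 2, 4, 4, 1, 2)      # four unique values

is_categorical(likert, n.cat = 4)     # TRUE
is_categorical(likert, n.cat = 0)     # FALSE, a value of 0 turns the rule off
```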

corCFA when default sort option on, sometimes items were not sorted properly

set the transparency level of plotted points in ScatterPlot now works

Changes for lessR version 2.4 (2012-07-21) <<<<<<

Recode recode individual values of an integer or factor variable

The following new functions work with a correlation matrix, named mycor by default, instead of the data matrix from which the correlations are computed. Each function that outputs correlations also generates a heat map of the output matrix.

corCFA confirmatory factor analysis and item analysis for multiple indicator measurement models from an input correlation matrix

corEFA exploratory factor analysis based on R factanal function, though also provides for a multiple indicator measurement model based on the exploratory analysis and the corCFA code for which to analyze the model

corList list the ordinal position of each variable in the input correlation matrix to facilitate using the other correlational routines

corProp calculate proportionality coefficients from an input correlation matrix, used to identify items that are indicators of the same factor

corRead read an input correlation, or other square, matrix

corReflect reflect specified variables in an input correlation matrix

corReorder re-order the specified variables in the input correlation matrix

corScree eigenvalue plot and plot of differences of successive eigenvalues to help determine the number of factors
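The quantities behind such a plot can be computed with base R. The sketch below uses the built-in mtcars data as a stand-in for an input correlation matrix and omits the plotting itself.

```r
R  <- cor(mtcars)                         # correlation matrix of built-in data
ev <- eigen(R, symmetric = TRUE)$values   # eigenvalues in decreasing order

# Differences of successive eigenvalues: large drops followed by small,
# roughly equal drops mark the start of the "scree"
drops <- -diff(ev)
round(cbind(eigenvalue = ev, drop = c(drops, NA)), 3)
```

The eigenvalues of a correlation matrix sum to the number of variables, which provides a quick check on the computation.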

System wide
Variable labels when applied to axis labels on a graph are now truncated to 50 characters for y-axis and 45 characters for x-axis to fit
All graphic files can now be saved from the call to the graphic function as preceding the function call with an R pdf statement does not work due to the customized graphics system that allows the Help window to persist across analyses
Cutoff value to interpret a numeric variable as categorical now called n.cat instead of n.cut, and is implemented system wide with the set function

Correlation
minimum default number of digits in output correlation matrix is 2
computed correlation matrix automatically written to mycor
missing data choices made explicit with parameter miss, pairwise is default
cell-wise sample size reported for pairwise deletion
effective sample size for all cells reported for listwise deletion
heat map added when a correlation matrix is computed
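The difference between pairwise and listwise deletion, and the sample sizes each implies, can be sketched with the base R cor function. The data frame is invented, and the use argument shown belongs to cor, not to the lessR miss parameter.

```r
d <- data.frame(x = c(1, 2, 3, 4, NA),
                y = c(2, 1, 4, 3, 5),
                z = c(5, NA, 3, 2, 1))

# Pairwise deletion: each correlation uses all rows complete for that pair
r_pair <- cor(d, use = "pairwise.complete.obs")

# Listwise deletion: only rows complete on every variable are used
r_list <- cor(d, use = "complete.obs")

# Cell-wise sample sizes under pairwise deletion
ok     <- (!is.na(d)) * 1          # 1 = observed, 0 = missing
n_pair <- crossprod(ok)

# Single effective sample size under listwise deletion
n_list <- sum(complete.cases(d))
```

Here the x-z cell rests on three rows while the x-y cell rests on four, which is why pairwise deletion reports a sample size per cell.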

Plot provide for a by variable, a grouping variable, for which the points are plotted in a different color and/or shape for each value of by

SummaryStats IQR added to full version

BarChart
legend placement option for right margin added to the standard R legend locations and is now the default, legend.loc="right.margin"
legend in right margin accommodates variable labels
for displayed cross-tabulation table, variable names instead of variable labels used
for count.names option, no longer needed to place the data frame name and a $ in front of the specified variable name
text.out option added so can be set to FALSE
when applied to a data frame, individual graphs written to individual files

Histogram
text.out option added so can be set to FALSE
when applied to a data frame, individual graphs written to individual files

BoxPlot
text.out option added so can be set to FALSE
when applied to a data frame, individual graphs written to individual files

ttest graph option added, if FALSE then no graph is produced for two groups

LineChart changed name from RunChart to better reflect its more general meaning

Deprecated function names removed color.barchart, color.boxplot, color.density, color.hist

Deprecated function names renamed sim.CLT to simCLT, sim.CImean to simCImean, sim.flips to simFlips, sim.means to simMeans

Summary Statistics labels for the two variables in a cross-tab no longer switched

to remove debug print statement

ttest extra null graphic window no longer generated for two groups analysis

BarChart
left margin on horizontal bar chart was sometimes too large
color theme for a single variable now properly displays

Changes for lessR version 2.3 (2012-06-10) <<<<<<

All data analysis functions now have two names: a longer, more descriptive name that involves uppercase letters, such as SummaryStats, and a short abbreviation, here ss. The two versions are equivalent. The purpose of the uppercase letters is to distinguish lessR functions from the standard R functions with similar names. When appropriate, functions can also have an abbreviation such as brief to indicate a briefer form of output, here ss.brief.

Using the new set function, the colors option sets the system wide color theme. The default is "blue" and several other colors are available, including gray scale with "gray". The colors option may also be applied to any one specific graphic function call to set the color theme just for that one resulting graph.

Transparency for plotting individual points with Plot is also available with the trans.pts option, from 0 to 1, with a 0 being opaque and a 1 being fully transparent. The trans.pts option may be set with the set function for all subsequent analyses, or it may be set for any one specific call to Plot.
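Because R colors store opacity (alpha) instead of transparency, a scale on which 0 is opaque and 1 is fully transparent has to be inverted. A sketch in base R follows, with trans_color as a hypothetical helper name; it is not the lessR code.

```r
# Hypothetical helper: convert a 0-1 transparency to a plotting color
trans_color <- function(color, trans) {
  adjustcolor(color, alpha.f = 1 - trans)   # alpha is opacity, so invert
}

opaque <- trans_color("blue", 0)      # fully opaque
half   <- trans_color("blue", 0.5)    # half transparent
clear  <- trans_color("blue", 1)      # fully transparent

# Inspect the alpha channel on its 0-255 scale
col2rgb(c(opaque, half, clear), alpha = TRUE)["alpha", ]
```

Passing a color such as half as the col argument of a base R plot makes overlapping points appear darker where the data are dense.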

lessR uses mydata as the default name of the data frame for data read with the Read or rad function. Now this convention is leveraged to drop the need for the R attach function, for including the data frame name and a $ in front of the variable name, or for using the with function. Instead, for each specified variable name, lessR searches the user's workspace, the global environment, as well as the data frame mydata, or the specified data frame name, for the relevant variable.

lessR functions now can access variable labels, which will replace the variable names on the axis labels for graphic output, and be displayed adjacent to the variable names on text output. Use the labels function to access the variable labels for standard R functions. See help(Read) for directions on how to enter the variable labels.

In addition to the new color themes, the appearance of the graphs has been changed to print the values along each axis scale in a smaller font and in a shade of gray instead of black.

Model, model Function for a linear analysis, which automatically calls the relevant function -- ttest, ANOVA or Regression -- and therefore replaces those functions from the user's perspective.

set Created as a wrapper for options(colors="xxx") added for graphics routines to specify a color theme.

to Created to generate variable name lists with sequential numbers when reading data into R such as from a csv data file.
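The kind of name list meant here can be produced in base R with paste0 or sprintf, as the following sketch shows. The prefix m and the file name in the comment are placeholders, and the to function itself is not called.

```r
# Names m1, m2, ..., m5 for five hypothetical survey items
item_names <- paste0("m", 1:5)

# Zero-padded variant so that the names sort in numerical order: m01 ... m12
padded <- sprintf("m%02d", 1:12)

# Typical use: supply the names when reading a csv file without a header row
# d <- read.csv("survey.csv", header = FALSE, col.names = item_names)
```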

Help, hlp updated help window opens when lessR is loaded

Read, rad
Add option for reading native R data files
SuppressWarnings to avoid warning message on read.csv for Excel csv files with no last
SuppressWarnings on SPSS files to avoid "Unrecognized record type 7"

Write, wrt
Add option for writing native R data files
Automatically add a file type, either .csv or .rda

ttest, tt
Also do analysis of not assuming equal variances.
Set extra decimal digit for analysis from stats (already set from data)
By default, at least two decimal digits
Added consistent formatting to numerical output according to digits.d
Two-group density graph has density scaling removed from y-axis
Two-group density graph has smaller font sizes and margin adjustment

BarChart, bc
Brief stat output now the default, bc.brief removed
Add error condition of not having col.bars and colors both activated
Add colors="gray" option and bc.gray
Display tick marks and tick labels in dark gray

Histogram, hst
Add vertical grid in addition to existing horizontal grid
Add colors="gray" option and hst.gray
Display tick marks and tick labels in dark gray

DotPlot, dp
Add colors="gray" option and dots.gray
Display tick marks and tick labels in dark gray

Density, dens
Remove horizontal grid, leaving no grid
Add colors="gray" option and dens.gray
Display tick marks and tick labels in dark gray

Plot, plt
Add colors="gray" option and plt.gray expression
Display tick marks and tick labels in dark gray
Add ncut to treat x as a factor if too few unique data values
When x is a factor, now do summary stats of y by each level of x
Run chart now lists n and n.missing
Scatterplot matrix and correlation matrix added for a data frame

Regression, reg
3D scatterplot optional for two predictor variables instead of required
The colors setting applies to reg graphs

BoxPlot, bx Consistent formatting of text output with default decimal digits

Correlation, cr
Correlation routine pulled from plt and made its own function
The version cr.brief added
Correlation matrix of a data frame added

RunChart, rc old access was Plot with one variable, which now produces a dot plot

prob.norm now returns the probability in the console like R pnorm function

qnt.t
renamed from prob.t as it is the quantile that is returned
now returns the quantile in the console like R qt function

system wide a variable named with the name of an R function is now permissible

ttest, tt
Analysis from summary statistics does not need a mydata data frame to exist prior to the function call
Variable labels no longer switched on response and grouping variables
Labels now work with one group analysis
Properly align group1 and group2 output to the user's workspace with the correct group

Plot, plt
Run plot failed if missing data
Plot with factor on x-axis failed plot of means if missing data
If missing data, do not try ellipse which fails
For bubble plot, x-axis now has proper scale

Summary Statistics, ss The null graphic window no longer opened

BoxPlot, bx When doing a dot plot, colors did not transfer from bx

Changes for lessR version 2.2 (2012-03-28) <<<<<<

sim.CLT Simulation for Central Limit Theorem
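What such a simulation demonstrates can be sketched in base R: means of repeated samples from a clearly non-normal population cluster around the population mean with the standard error that theory predicts. The sample sizes and the uniform population are choices made for this illustration and are not taken from sim.CLT.

```r
set.seed(123)

n_samples   <- 2000   # number of repeated samples
sample_size <- 30     # observations per sample

# Draw from a non-normal population, uniform on 0 to 1
means <- replicate(n_samples, mean(runif(sample_size)))

# Theory: mean 0.5 and standard error sqrt(1/12) / sqrt(sample_size)
se_theory <- sqrt(1 / 12 / sample_size)
c(mean = mean(means), se = sd(means), se_theory = se_theory)
```

A histogram of means, for example with hist(means), would show the approximately normal shape.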

sim.flips Simulation of coin flips

sim.CImean Simulation of confidence interval
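The idea behind simulating confidence intervals is that, over many repeated samples, about 95% of the 95% intervals contain the true mean. The population values and the number of repetitions below are assumptions for this base R sketch, which does not call sim.CImean.

```r
set.seed(42)

mu <- 100; sigma <- 15   # hypothetical population mean and standard deviation
n    <- 25               # sample size
reps <- 1000             # number of repeated samples

# For each sample, record whether the 95% t-interval contains mu
covered <- replicate(reps, {
  y  <- rnorm(n, mu, sigma)
  ci <- t.test(y, conf.level = 0.95)$conf.int
  ci[1] <= mu && mu <= ci[2]
})

mean(covered)   # proportion of intervals that contain mu
```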

sim.means Simulation of repeated sampling of means

prob.t Probabilities of t-distribution

Note: The following long form names are not valid as function calls until Version 2.3.

graphics routines Font for scale values along each axis smaller and in a dark gray

ttest, tt
Rewrite to allow one or two groups, data or summary stats
Allow missing data
Restore graphic parameters so top margin of graphics window not too large
tt.brief option added

BarChart, bc
Count.names option for reading counts directly from a data file with counts
Smaller font for legend and no legend border to better display
Smaller font for axis values
For horizontal plots, horizontal labels and accommodate space in left margin
bc.brief added, brief=FALSE is new default for bc

Histogram, hst Smaller font for scale

Plot, plt
Smaller font for legend and no legend border for 2 variables to better display
Provide covariance coefficient

Summary Statistics, ss ss.brief added

Correlation, cr Provide covariance coefficient

Density, dens [now den in 2.3] Provide densities on vertical axis as an option instead of a requirement

Read, rad
Read spss (.sav) files in addition to csv data files
For rad.both, have labels display correctly

prob.norm [normal curve probabilities]
Only give normal densities on vertical axis as an option
Add second x axis, z-scores
Make vertical, density, axis as an option
Scale axis labels to .9, add mag option

prob.znorm [normal curve display with z-scores]
Default y-axis to null, add as an option
Add z-values as a default
Scale x-axis according to standard deviations
Scale axis labels to .9, add mag (magnify) option

stats.t.test Removed, incorporated into ttest which now processes data from summary statistics or the data

Regression, reg If a residual is 0, the resulting NaN's, treated as missing data, no longer cause the Cook's distance function, and therefore the entire function, to fail

Changes for lessR version 2.1 (2012-02-08) <<<<<<

dots [in 2.3 rename to DotPlot, dp] created the function

pieplot [in 2.3 rename to PieChart, pc] created the function

package
add citation
all lessR functions that read data have attach requirement removed
relevant lessR functions have automatic use of variable labels
variable labels function label

rad [in 2.3 also named Read]
display name of file read at the beginning of the output only if rad()
default is now to not attach mydata
add read labels options with rad.labels and rad.both
add max.lines options and display full data/labels when applicable
convert display option to brief option, add function

reg [in 2.3 also named Regression]
add error check for no data frame, which is required
for Background, specify number of obs retained for analysis
add references
add variable labels to Background section where variables are listed
add reg.brief and reg.explain methods
reformat Basic Analysis output to print all values individually
if a non-numeric variable in model, then do not attempt scatterplot matrix
if a non-numeric variable in model, then no scatter.3d plot

plt [in 2.3 also named Plot]
in title, use actual variable names instead of "x and y"
loess fit line replaces lowess, along with access to loess span parameter
use dates from an existing time series
add missing data count
put error traps for calling with the wrong data types

barchart [in 2.3 rename to BarChart, bc]
reformat output to much more compact
for UseMethod, evaluate class of 1st attribute only to avoid a warning
add a y-axis label
get border option to work and change name to col.border
properly switch axis labels if horiz=TRUE
add warning message for beside option off with only a single variable
add warning message if addtop set for a horizontal bar graph
make chisq the default, and reformat output
put the variable names on the tables for row and column proportions
re-scale bar width for 2, 3 or 4 bars, from 1 var or stacked 2 var
get vivid option to work for gradient applied to ordinal data
for vertical graphs of two vars, make legend horizontal with addtop room
for data frame, numeric data types of few unique values treated as categorical
for data frame, add dev.off() when finished with graphs
enhanced the color palettes with R palettes of rainbow, heat and terrain
make stacked chart for two variables the default

smooth [in 2.3 rename to Density, den]
add bw parameter
make no density axis the default, but add y.axis to include if desired
add normality test
allow for missing data
change col.hist to col.bars
get color of the plotted curves working

histogram [in 2.3 rename to Histogram, hst]
move Number of Bins output next to freq dist, add Bin Width
summary statistics and label now with describe.numeric
scientific notation turned off for histogram plot
col and border options renamed to col.bars and col.border
summary stats now provided
for data frame, numeric data types of few unique values treated as categorical
for error message regarding bin range, turn off scientific notation

boxp [in 2.3 rename to BoxPlot, bp]
add values of outliers to text output
adjust axis labels for vertical box plot
provide for default colored background and grid lines

describe [in 2.3 rename to SummaryStats, ss]
formula input changed to by= option
output reformatted to much more compact, and extended
outlier detection added to description of numerical variables
if too many values, then just report counts
if all values unique, just report the values and a note as an ID field
for numeric, if digits.d > 10, output size changed to 4 with prompt to override
for data frame, numeric data types of few unique values treated as categorical
add skewness, kurtosis
add a brief=TRUE option, which works for both numeric and categorical variables

smd.t.test [in 2.3 rename ttest,tt] add brief option and function

stats.t.test [in 2.3 incorporated into ttest]
add one-sample option
by default resolve number of digits from precision of entered stats

help.me [in 2.3 now named Help, hlp] update color.hist [in 2.3 Histogram, hst] description

plt [in 2.3 also named Plot] missing data caused an error in bubble plot

reg [in 2.3 also named Regression] allow missing data