Quickly and easily perform exploratory data analysis by uploading your data as a 'csv' file. Start generating insights using 'ggplot2' plots and 'table1' tables with descriptive stats, all using an easy-to-use point and click 'Shiny' interface.
# Install from CRAN:
install.packages("ggquickeda")

# Or the development version from GitHub:
# install.packages("devtools")
devtools::install_github("smouksassi/ggquickeda")
To launch the application, use run_ggquickeda() and then navigate to your csv file, or use run_ggquickeda(data) to launch the app with a specific dataset already loaded.
ggquickeda is an R Shiny app/package that serves as a handy interface to ggplot2 and table1. It enables you to quickly explore your data and detect trends on the fly. You can create scatter plots, dotplots, boxplots, barplots, histograms, densities and summary statistics of one or more variables, split by column(s). For a quick overview using an older version of the app, head to this YouTube tutorial.
The Export Plots and Plot Code tabs were contributed by Dean Attali. Once a plot is saved in the X/Y Plot tab, by providing a name and hitting the Save plot star button, it becomes available for exporting. You can export in portrait or landscape orientation, with one or multiple plots per page.
Plot Code lets you look at the source code that generated the plot with the various options you selected. This is a helpful way to get familiar with ggplot2 code.
Quick summary statistics tables using Benjamin Rich's table1 package.
The best way to learn is to load a dataset you are familiar with and start experimenting. Try to reproduce the steps below using the included sample_df.csv. This will give you an idea of the kind of outputs that can be generated.
The package also has two vignettes.
Here is an overview of some of the things that can be done with the various menus:
Choose csv file to upload or use sample data This executes the code that loads your csv file or the internal sample_data.csv:
read.csv("youruploadeddata.csv",na.strings = c("NA","."))
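As a small illustration of what the na.strings argument does, the sketch below reads an inline csv string so that it runs without a file on disk. The data and column names are made up for the example. Both NA and a lone . are read as missing values, which keeps the Conc column numeric.

```r
# Hypothetical csv content standing in for an uploaded file.
csv_text <- "ID,Time,Conc
1,0,.
1,1,10.5
2,0,NA
2,1,8.2"

# Same na.strings as the load step above: "NA" and "." both become NA.
dat <- read.csv(text = csv_text, na.strings = c("NA", "."))

str(dat)              # Conc is read as numeric, not character
sum(is.na(dat$Conc))  # number of missing concentrations
```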
Once your data is uploaded, the first column is selected for y variable(s) and the second column for x variable, and a simple scatter plot of y versus x is shown. ggquickeda can handle one or more y variable(s) selections but only one x variable. Note that the x variable should be different from those selected for y variable(s).

Whether the user selects one or more y variable(s), the y data is automatically stacked (gathered) into two columns named yvalues (the values) and yvars (an identifier of the variable each value comes from). A scatter plot of yvalues versus x, faceted by yvars, is then shown. Mixing categorical and continuous variables causes all yvalues to be treated as character.

The order of the selected y variable(s) matters and can be changed via drag and drop. Selections can be removed by clicking on the small x. When no y variable is selected, a histogram (if the x variable is continuous) or a barplot (if the x variable is categorical) is shown.
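The stacking step can be pictured with the base R sketch below. It is not the app's actual code, and the data frame and the y_selected vector are hypothetical. It only shows how two selected y columns end up in the yvars and yvalues columns described above.

```r
# Hypothetical data: one x variable (Time) and two y variables.
dat <- data.frame(
  Time   = c(0, 1, 2),
  Conc   = c(10, 8, 5),
  Effect = c(1.0, 1.5, 2.1)
)
y_selected <- c("Conc", "Effect")  # selection order sets the level order

# Stack each selected y column into (yvars, yvalues), keeping other columns.
keep <- setdiff(names(dat), y_selected)
stacked <- do.call(rbind, lapply(y_selected, function(v) {
  cbind(dat[keep], yvars = v, yvalues = dat[[v]])
}))

# Keep the user's selection order as the factor level order.
stacked$yvars <- factor(stacked$yvars, levels = y_selected)

stacked  # long format with columns Time, yvars, yvalues
```

Plotting yvalues against Time and faceting by yvars would then give one panel per selected y variable.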
After selecting your y variable(s), if any, and your x variable, you can proceed directly to data manipulation within the Inputs tab using the following subtabs. Note that subtab execution is sequential, i.e. each subtab's actions are executed in the order the subtabs appear. If the user changes an upstream action, the subsequent ones are reset.
Recode/Reorder Categories This subtab is dynamic in the sense that the user can add/remove variables. Once a non-numeric variable is selected, another field with the current variable levels is generated. The user can reorder the levels using drag and drop and/or edit a level by hitting Backspace and typing in a new character string. Note that the order chosen here might not be reflected in the yvalues; a separate subtab, Reorder Facets or axis Levels, is provided for this after stacking.
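In plain R terms, reordering and recoding levels amounts to something like the following sketch. The variable sex and its new labels are hypothetical, and the app's generated code may differ.

```r
# Hypothetical categorical variable.
sex <- c("M", "F", "F", "M")

# Reorder: put "M" first instead of the default alphabetical order.
sex_f <- factor(sex, levels = c("M", "F"))

# Recode: rename the levels while keeping that order.
levels(sex_f) <- c("Male", "Female")

levels(sex_f)
table(sex_f)
```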
Combine Two Variables This enables the user to select two categorical variables, Var1 with levels (V1L1, V1L2) and Var2 with levels (V2L1, V2L2), to generate a new variable named Var1_Var2 with levels V1L1_V2L1, V1L1_V2L2, V1L2_V2L1, V1L2_V2L2 and so on.
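A minimal base R sketch of the same idea, using the level names from the description above. The data frame is hypothetical, and paste() is used here only as one plausible way to build the combined variable.

```r
# Hypothetical data with the two categorical variables described above.
dat <- data.frame(
  Var1 = c("V1L1", "V1L1", "V1L2", "V1L2"),
  Var2 = c("V2L1", "V2L2", "V2L1", "V2L2")
)

# New variable whose values join the two levels with an underscore.
dat$Var1_Var2 <- paste(dat$Var1, dat$Var2, sep = "_")

unique(dat$Var1_Var2)
```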
Filters Up to six sequential filters are available: three for any type of variable, Filter variable (1), Filter variable (2) and Filter variable (3), and three for continuous variables, Filter continuous (1), Filter continuous (2) and Filter continuous (3).
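Sequential here means that each filter works on the output of the previous one. The sketch below mimics one filter on a categorical variable followed by one range filter on a continuous variable. The data, the column names and the cut-offs are all hypothetical.

```r
# Hypothetical data.
dat <- data.frame(
  ID   = c(1, 1, 2, 2, 3, 3),
  Arm  = c("A", "A", "B", "B", "A", "A"),
  Conc = c(0.5, 12, 3, 25, 7, 40)
)

# Filter 1 (any variable type): keep the selected values of Arm.
step1 <- dat[dat$Arm %in% c("A"), ]

# Filter 2 (continuous): keep a range of Conc, applied to the output of filter 1.
step2 <- step1[step1$Conc >= 1 & step1$Conc <= 20, ]

step2
```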
One Row by ID(s) Filter the data down to distinct values (one row) of the selected variable(s), which are usually identifiers for subjects, occasions, arms etc. In long data format, several time-invariant variables are repeated on every row; this subtab helps remove the repetitions. The user might want the first row of each subject, or the first row of each subject/occasion combination, etc.
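One way to picture this step in base R is with duplicated(), as in the sketch below. The data frame and the identifier columns ID and OCC are hypothetical, and the app may implement the step differently.

```r
# Hypothetical long-format data: Weight is time invariant within a subject.
dat <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 2),
  OCC    = c(1, 1, 2, 1, 2, 2),
  Weight = c(70, 70, 70, 82, 82, 82),
  Time   = c(0, 1, 0, 0, 0, 1)
)

# First row of each subject.
first_per_id <- dat[!duplicated(dat["ID"]), ]

# First row of each subject/occasion combination.
first_per_id_occ <- dat[!duplicated(dat[c("ID", "OCC")]), ]

first_per_id
first_per_id_occ
```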
Simple Rounding Rounding a numerical variable to a specified number of digits. It can help to come up with a crude binning.
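To see why rounding can act as a crude binning, consider the sketch below. The age values are made up. Negative digits are a feature of base R's round(), and whether the app accepts them is an assumption not confirmed by the text above.

```r
# Hypothetical continuous variable.
age <- c(23.4, 27.9, 31.2, 38.6, 44.1)

# Round to whole numbers.
round(age, digits = 0)

# In base R, negative digits round to tens, giving coarse bins.
age_bin <- round(age, digits = -1)
table(age_bin)
```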
Reorder Facets or axis Levels Enables the user to reorder the yvalues using a statistical function (Median, Mean, Minimum or Maximum of another variable) with a checkbox to quickly reverse the order, if desired. The user can also manually drag and drop an order and change the name of the levels where \n is recognized as a line break.
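Reordering levels by a statistic of another variable can be sketched with base R's reorder(). The data below is hypothetical, and the app's own code may use a different function.

```r
# Hypothetical data: order the Arm levels by the median of Conc.
dat <- data.frame(
  Arm  = c("A", "A", "B", "B", "C", "C"),
  Conc = c(5, 7, 1, 3, 9, 11)
)

# Levels sorted by increasing median Conc within each Arm.
arm_by_median <- reorder(factor(dat$Arm), dat$Conc, FUN = median)
levels(arm_by_median)

# Equivalent of the reverse-order checkbox: flip the level order.
arm_reversed <- factor(arm_by_median, levels = rev(levels(arm_by_median)))
levels(arm_reversed)
```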
Various options to tweak the plot:
A shorter version of this walk-through is available within the app.
The main plot is output here, with the various options to generate the plot below it. The possibilities include:
ggplot2 built-in functionality for Group, color, size and fill mappings, as well as up to two variables each for column and row splits (faceting).
Installing the package should handle the installation of all dependencies. They are listed here in case you are curious:
The app can also be launched directly using this command:
shiny::runGitHub('ggquickeda', 'smouksassi', subdir = 'inst/shinyapp')
Initial CRAN release