The R package 'ashr' implements an Empirical Bayes
approach for large-scale hypothesis testing and false discovery
rate (FDR) estimation based on the methods proposed in
M. Stephens, 2016, "False discovery rates: a new deal",
This repository contains an R package for performing "Adaptive Shrinkage."
To install the ashr package first you need to install devtools:
The main function in the ashr package is
ash. To get minimal help:
The ashr ("Adaptive SHrinkage") package aims to provide simple, generic, and flexible methods to derive "shrinkage-based" estimates and credible intervals for unknown quantities $\beta=(\beta_1,\dots,\beta_J)$, given only estimates of those quantities ($\hat\beta=(\hat\beta_1,\dots, \hat\beta_J)$) and their corresponding estimated standard errors ($s=(s_1,\dots,s_J)$).
The "adaptive" nature of the shrinkage is two-fold. First, the appropriate amount of shrinkage is determined from the data, rather than being pre-specified. Second, the amount of shrinkage undergone by each $\hat\beta_j$ will depend on the standard error $s_j$: measurements with high standard error will undergo more shrinkage than measurements with low standard error.
The methods are based on treating the vectors $\hat\beta$ and $s$ as "observed data", and then performing inference for $\beta$ from these observed data, using a standard hierarchical modelling framework to combine information across $j=1,\dots,J$.
Specifically, we assume that the true $\beta_j$ values are independent
and identically distributed from some unimodal distribution $g$. By
default we assume $g$ is unimodal about zero and symmetric. You can
specify or estimate a different mode using the
mode parameter. You
can allow for asymmetric $g$ by specifying
Then, we assume that the observations $\hat\beta_j \sim
N(\beta_j,s_j)$, or alternatively the normal assumption can be
replaced by a $t$ distribution by specifying
df, the number of
degrees of freedom used to estimate $s_j$. Actually this is
important: do be sure to specify
df if you can.
There are a few major changes in output and input that will likely break existing dependencies. Here are the highlights
the main output (lfsr, lfdr, etc) is rearranged into a dataframe,
so, for example, the lfsr is now
a$result$lfsr instead of
or, better, use the accessor function
get_lfsr(a) to extract
the lfsr etc
I added accessor functions
get_psd etc to access the lfsr, lfdr, posterior mean and posterior
standard deviation. Using these functions to access results will
help ensure your code remains valid if I happen to change the
internal structure of the results again (although not
fitted.g is renamed
flash_data to make the whole package convention more
compdens replaced with
comp_dens, again for consistency.
nonzeromode option is replaced with the option
mode to specify
mode. Or use
mode="estimate" to estimate the mode.
more flexible control of output. For example, you can say you want
only the logLR output by specifying
outputlevel = c("lfsr","logLR"), or only posterior mean by
outputlevel = c("PosteriorMean").