Flexible Phenotype Simulation from Different Genetic and Noise Models

Simulation is a critical part of method development and assessment in quantitative genetics. 'PhenotypeSimulator' allows for the flexible simulation of phenotypes under different models, including genetic variant and infinitesimal genetic effects (reflecting population structure) as well as non-genetic covariate effects, observational noise and additional correlation effects. The different phenotype components are combined into a final phenotype while controlling for the proportion of variance explained by each of the components. For each effect component, the number of variables, their distribution and the design of their effect across traits can be customised. For the simulation of the genetic effects, external genotype data from several standard software packages and formats ('plink', 'hapgen2'/'impute2', 'genome', 'bimbam', simple text files) can be imported. The final simulated phenotypes and their components can be automatically saved into .rds or .csv files. In addition, they can be saved in formats compatible with commonly used genetic association software ('gemma', 'bimbam', 'plink', 'snptest', 'LiMMBo').


PhenotypeSimulator

PhenotypeSimulator allows for the flexible simulation of phenotypes from different genetic and non-genetic (noise) components.

In quantitative genetics, genotype to phenotype mapping is commonly realised by fitting a linear model with the genotype as the explanatory variable and the phenotype as the response variable. Other explanatory variables such as additional sample measures (e.g. age, height, weight) or batch effects can also be included. For linear mixed models, in addition to the fixed effects of the genotype and the covariates, different random effect components can be included, accounting for population structure in the study cohort or environmental effects. The application of linear and linear mixed models in quantitative genetics ranges from genetic studies in model organisms such as yeast and Arabidopsis thaliana to human molecular, morphological or imaging-derived traits. Developing new methods for larger sample cohorts, more phenotypic measurements or more complex phenotypes often requires the simulation of datasets with a specific underlying phenotype structure.

PhenotypeSimulator allows for the simulation of complex phenotypes under different models, including genetic variant effects and infinitesimal genetic effects (reflecting population structure) as well as correlated, non-genetic covariates and observational noise effects. Different phenotypic effects can be combined into a final phenotype while controlling for the proportion of variance explained by each of the components. For each component, the number of variables, their distribution and the design of their effect across traits can be customised.
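
The variance control described above can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the PhenotypeSimulator API. It assumes one simple scheme: each component is an N x P matrix (samples by traits) that is rescaled so its mean column variance equals a target proportion, and the components are then summed. All function names, the allele frequency of 0.3 and the variance split of 0.4/0.6 are hypothetical choices for illustration.

```python
import random
import statistics


def mean_column_variance(matrix):
    """Average of the per-trait (column) variances of an N x P matrix."""
    n_traits = len(matrix[0])
    column_variances = [
        statistics.pvariance([row[j] for row in matrix]) for j in range(n_traits)
    ]
    return sum(column_variances) / n_traits


def rescale_variance(component, prop_var):
    """Scale a component so that its mean column variance equals prop_var."""
    scale = (prop_var / mean_column_variance(component)) ** 0.5
    return [[value * scale for value in row] for row in component]


def simulate_phenotype(n_samples=200, n_traits=3, n_snps=10, gen_var=0.4, seed=1):
    """Combine a genetic and a noise component with fixed variance proportions."""
    rng = random.Random(seed)
    # Biallelic genotypes coded 0/1/2 (assumed allele frequency 0.3).
    genotypes = [
        [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n_snps)]
        for _ in range(n_samples)
    ]
    # One effect size per SNP and trait, drawn from a standard normal.
    effects = [[rng.gauss(0.0, 1.0) for _ in range(n_traits)] for _ in range(n_snps)]
    genetic = [
        [sum(g * effects[s][t] for s, g in enumerate(row)) for t in range(n_traits)]
        for row in genotypes
    ]
    noise = [[rng.gauss(0.0, 1.0) for _ in range(n_traits)] for _ in range(n_samples)]

    # Rescale each component to its share of the variance, then add them up.
    genetic = rescale_variance(genetic, gen_var)
    noise = rescale_variance(noise, 1.0 - gen_var)
    phenotype = [
        [g + e for g, e in zip(g_row, e_row)] for g_row, e_row in zip(genetic, noise)
    ]
    return genetic, noise, phenotype
```

In this sketch the proportions hold exactly for each component, but the variance of the summed phenotype is only approximately 1 in a finite sample, because the components are not perfectly uncorrelated.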

Installation

Full documentation of PhenotypeSimulator is available at http://HannahVMeyer.github.io/PhenotypeSimulator/.

The current GitHub version of PhenotypeSimulator is 0.3.1 and can be installed via

devtools::install_github("HannahVMeyer/PhenotypeSimulator")

The current CRAN version of PhenotypeSimulator is 0.2.2 (soon to be updated).

A log of version changes can be found in the News section below.

Citation

Meyer, HV & Birney E (2018) PhenotypeSimulator: A comprehensive framework for simulating multi-trait, multi-locus genotype to phenotype relationships, Bioinformatics, 34(17):2951–2956

News

PhenotypeSimulator 0.3.0

Major changes

  1. Add option for non-linear transformation of simulated phenotypes: function transformNonlinear (https://github.com/HannahVMeyer/PhenotypeSimulator/blob/master/R/createphenotypeFunctions.R), accessible from runSimulation. Both transformed and original phenotypes are automatically returned with savePheno.
  2. Replace parameter 'oxgen' in readStandardGenotypes and getCausalSNPs with 'format' - ensures proper specification of genotype format for all cases.

Minor changes

  1. In addition to the full kinship matrix, savePheno and writeStandardOutput write the eigenvalues and eigenvectors of the kinship matrix.
  2. Output file names have been made more consistent in savePheno and writeStandardOutput.
  3. Causal SNPs are now also saved in specified standard output format.
  4. LiMMBo has been added as an output format in savePheno and writeStandardOutput.

PhenotypeSimulator 0.2.2

Minor changes

  1. Update readStandardGenotypes to be compatible with the latest release of data.table (v1.11.2).

PhenotypeSimulator 0.2.1

Minor changes

  1. Additional tests for compatibility of input parameters with variance components functions, genotype functions and output functions.
  2. Bug fix in output function: savePheno now properly saves kinship matrix as .rds.

PhenotypeSimulator 0.2.0

Major changes

Input

  1. PhenotypeSimulator now includes readStandardGenotypes which can read externally simulated or user-provided genotypes in plink, genome, oxgen (hapgen/impute2), bimbam or simple delimited format.
  2. A user-specified correlation matrix can be provided for the simulation of the correlatedBgEffects.
  3. Short option flags for command-line use of PhenotypeSimulator were removed.
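
How a user-specified correlation matrix can shape correlated effects across traits can be sketched as follows. This is an illustration in plain Python under stated assumptions, not the code behind correlatedBgEffects: it assumes the matrix is symmetric and positive definite and uses a Cholesky factor to turn independent standard normal draws into correlated ones. The function names and the example correlation of 0.8 are hypothetical.

```python
import random


def cholesky(matrix):
    """Lower-triangular factor L with L * L^T == matrix (positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                diagonal = matrix[i][i] - partial
                if diagonal <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = diagonal ** 0.5
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def correlated_effects(n_samples, correlation, seed=1):
    """Draw an N x P matrix whose traits follow the given correlation matrix."""
    rng = random.Random(seed)
    lower = cholesky(correlation)
    n_traits = len(correlation)
    rows = []
    for _ in range(n_samples):
        # Independent standard normal draws, one per trait.
        z = [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
        # Multiplying by the Cholesky factor induces the target correlation.
        rows.append(
            [sum(lower[t][k] * z[k] for k in range(t + 1)) for t in range(n_traits)]
        )
    return rows
```

For example, `correlated_effects(5000, [[1.0, 0.8], [0.8, 1.0]])` yields two traits whose sample correlation is close to 0.8.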

Output

  1. PhenotypeSimulator provides the option to save the simulated phenotypes and genotypes in formats compatible with several commonly used genetic association software packages (gemma, bimbam, plink, snptest) via writeStandardOutput.
  2. Intermediate phenotype components are now saved by default.
  3. Saving additional subsets of the simulated data has been removed.

Variance components

  1. Genotype simulation and kinship estimation: functions for genotype simulation and kinship estimation have been rewritten, giving significant speed-ups in computation time.

  2. geneticFixedEffects and noiseFixedEffects:

    1. The effect size distributions of the shared effects are now modelled as the product of two exponential distributions (to yield an approximately uniform distribution) or the product of a normal distribution with user-specified parameters and a standard normal distribution.

    2. The independent effects can now be specified to affect the same subset or different subsets of traits (via keepSameIndependent).

    3. The overall number of traits affected by the effects can now be specified via pTraitsAffected.

  3. correlatedBgEffects: the additional correlation between the traits can be specified by the user by providing an external correlation matrix.
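
The shared effect sizes described in item 2 can be pictured with a small sketch. It is plain Python, not the package's implementation, and it rests on an assumption about what "product" means here: one value is drawn per variable (for example per SNP) and one per trait, and their outer product gives the variable-by-trait effect matrix. The function name, the unit rate of the exponential draws and the default parameters are hypothetical.

```python
import random


def shared_effects(n_variables, n_traits, distribution="normal",
                   mean=0.0, sd=1.0, seed=1):
    """Effect matrix (variables x traits) built as a product of two draws."""
    rng = random.Random(seed)
    if distribution == "normal":
        # Normal with user-specified parameters times a standard normal.
        per_variable = [rng.gauss(mean, sd) for _ in range(n_variables)]
        per_trait = [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
    elif distribution == "exponential":
        # Product of two exponential draws (rate 1 assumed for illustration).
        per_variable = [rng.expovariate(1.0) for _ in range(n_variables)]
        per_trait = [rng.expovariate(1.0) for _ in range(n_traits)]
    else:
        raise ValueError("distribution must be 'normal' or 'exponential'")
    # Outer product: every variable affects every trait with a shared pattern.
    return [[a * b for b in per_trait] for a in per_variable]
```

Under this reading, each variable's effect across traits is a scaled copy of the same per-trait pattern, which is what makes the effect "shared". Restricting effects to a subset of traits, as with pTraitsAffected, is not modelled in this sketch.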

install.packages("PhenotypeSimulator")

0.3.1 by Hannah Meyer, 4 months ago


https://github.com/HannahVMeyer/PhenotypeSimulator


Report a bug at https://github.com/HannahVMeyer/PhenotypeSimulator/issues


Browse source code at https://github.com/cran/PhenotypeSimulator


Authors: Hannah Meyer [aut, cre], Konrad Rudolph [ctb]



MIT + file LICENSE license


Imports methods, optparse, R.utils, mvtnorm, snpStats, zoo, data.table, Rcpp, ggplot2, reshape2, dplyr

Suggests testthat, knitr, rmarkdown

Linking to Rcpp

