Infers Novel Immunoglobulin Alleles from Sequencing Data

Infers the V genotype of an individual from immunoglobulin (Ig) repertoire sequencing data (AIRR-Seq, Rep-Seq), including detection of any novel alleles. This information is then used to correct existing V allele calls among the sample sequences. Citations: Gadala-Maria et al. (2015); Gadala-Maria et al. (2019).

High-throughput sequencing of B cell immunoglobulin receptors is providing unprecedented insight into adaptive immunity. A key step in analyzing these data involves assignment of the germline V, D and J gene segment alleles that comprise each immunoglobulin sequence by matching them against a database of known V(D)J alleles. However, this process will fail for sequences that utilize previously undetected alleles, whose frequency in the population is unclear.

TIgGER is a computational method that significantly improves V(D)J allele assignments by first determining the complete set of gene segments carried by an individual (including novel alleles) from V(D)J-rearranged sequences. TIgGER can then infer a subject's genotype from these sequences and use this genotype to correct the initial V(D)J allele assignments.

The application of TIgGER continues to identify a surprisingly high frequency of novel alleles in humans, highlighting the critical need for this approach. (TIgGER can also be used with data from other species, and has been.)

Core Abilities

  • Detecting novel alleles
  • Inferring a subject's genotype
  • Correcting preliminary allele calls
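
The last two abilities can be illustrated with a deliberately simplified toy sketch. It is not TIgGER's statistical method and not its API: the function names, the comma-separated format for ambiguous calls, and the 0.875 coverage threshold are all assumptions made for illustration, and novel allele detection is not covered at all.

```python
from collections import Counter, defaultdict


def gene_of(allele):
    """Return the gene part of an allele name, e.g. 'IGHV1-2*02' -> 'IGHV1-2'."""
    return allele.split("*")[0]


def infer_genotype(v_calls, fraction=0.875):
    """Toy genotype inference (hypothetical helper, not the package's function).

    For each gene, keep the most frequent alleles until they account for
    at least `fraction` of that gene's unambiguously called sequences.
    """
    counts = defaultdict(Counter)
    for call in v_calls:
        alleles = call.split(",")
        # Only calls naming a single allele are used as evidence.
        if len(alleles) == 1:
            counts[gene_of(alleles[0])][alleles[0]] += 1

    genotype = {}
    for gene, allele_counts in counts.items():
        total = sum(allele_counts.values())
        kept, explained = [], 0
        for allele, n in allele_counts.most_common():
            if explained >= fraction * total:
                break
            kept.append(allele)
            explained += n
        genotype[gene] = kept
    return genotype


def reassign(call, genotype):
    """Drop alleles absent from the genotype; keep the call as-is if none survive."""
    kept = [a for a in call.split(",") if a in genotype.get(gene_of(a), [])]
    return ",".join(kept) if kept else call


# Assumed example data: nine sequences call *02, one calls *04, one is ambiguous.
calls = ["IGHV1-2*02"] * 9 + ["IGHV1-2*04"] + ["IGHV1-2*02,IGHV1-2*04"]
genotype = infer_genotype(calls)
corrected = [reassign(c, genotype) for c in calls]
```

In this toy data the rare `*04` allele falls below the coverage threshold, so the ambiguous call is narrowed to the single allele retained in the genotype.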

Required Input

  • A table of sequences from a single individual, with columns containing the following:
    • V(D)J-rearranged nucleotide sequence (in IMGT-gapped format)
    • Preliminary V allele calls
    • Preliminary J allele calls
    • Length of the junction region
  • Germline Ig sequences in IMGT-gapped FASTA format (e.g., as downloaded from IMGT/GENE-DB)

The former can be created through the use of IMGT/HighV-QUEST and Change-O.
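
Before running an analysis, it can help to confirm that the sequence table actually carries the four required columns. The following is a minimal sketch, assuming a tab-delimited file with AIRR Rearrangement-style column names (`sequence_alignment`, `v_call`, `j_call`, `junction_length`); tables exported by other tools or older versions may use different names, and `missing_columns` is a hypothetical helper, not part of the package.

```python
import csv
import io

# Assumed column names (AIRR Rearrangement-style); adjust to match your table.
REQUIRED = ("sequence_alignment", "v_call", "j_call", "junction_length")


def missing_columns(tsv_text, required=REQUIRED):
    """Return the required column names that are absent from the TSV header."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])  # an empty input yields an empty header
    return [name for name in required if name not in header]


# Example header lacking the junction length column.
example = "sequence_id\tsequence_alignment\tv_call\tj_call\n"
absent = missing_columns(example)
```

An empty result means every required column is present; otherwise the list names what still has to be added or renamed.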


For help, questions, or suggestions, please contact the Immcantation Group or use the issue tracker.


Version 0.3.1: October 19, 2018

  • Fixed a fatal error in reassignAlleles with non-existent v_call column.
  • Fixed a bug in generateEvidence that was reporting amino acid mutations as NA instead of gaps.

Version 0.3.0: October 3, 2018

Bug Fixes:

  • Fixed a bug in reassignAlleles occurring with single match genotypes.
  • Fixed selectNovel improperly removing all identical novel alleles, rather than keeping a single entry.
  • genotypeFasta will now retain IMGT-numbering spacers as . characters instead of converting them to - characters.
  • Fixed a bug in findNovelAlleles causing overly aggressive minimum sequence threshold filtering.
  • Fixed a bug in the grouping behavior of getPopularMutationCount.

New Features:

  • Added a Bayesian approach to genotype inference as the inferGenotypeBayesian function.
  • Added the function generateEvidence to build a complete evidence table from the results of findNovelAlleles, inferGenotype, inferGenotypeBayesian, and reassignAlleles.
  • Added multiple new evidence columns to the output of findNovelAlleles and adjusted the definitions/names of some existing columns.
  • Added behavior to the keep_gene argument of reassignAlleles to provide options for maintaining reassignments at the gene (previous TRUE behavior), family, or repertoire level.
  • Improved tie resolution in findNovelAlleles.

Backwards Incompatible Refactors:

  • Renamed sample data from germline_ighv, sample_db, genotype and novel_df to GermlineIGHV, SampleDb, SampleGenotype and SampleNovel, respectively.
  • Renamed the novel_df argument to novel in selectNovel, inferGenotype, and genotypeFasta.
  • Renamed the novel_df_row argument to novel_row in plotNovel.
  • Argument order in inferGenotype was altered for clarity.
  • Changed the return behavior of reassignAlleles so that it returns the input data.frame with the V_CALL_GENOTYPED column appended or overwritten.
  • cleanSeqs will no longer replace . characters with -.

Version 0.2.11: September 21, 2017

  • Improved memory utilization in findNovelAlleles.

Version 0.2.10: July 1, 2017

  • Bugfix wherein inferGenotype would break when performing the check for alleles that could not be distinguished.

Version May 16, 2017

  • Bugfix wherein inferGenotype would break if all sequences submitted were from a single gene and find_unmutated was set to TRUE.

Version 0.2.9: March 24, 2017

  • License changed to Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).

Version 0.2.8: August 26, 2016

  • Bugfix following recent update of alakazam (0.2.5) to import selectively.
  • Removed unneeded dependency on shazam package (not needed as of

Version 0.2.7: July 24, 2016

  • More updates to work with the latest version of dplyr (0.5.0).
  • Bugfix in findNovelAlleles when an allele passed germline_min but not min_seqs.
  • Fixed vignette typo and updated findUnmutatedCalls man page.

Version 0.2.6: July 01, 2016

  • Updated code to work with the latest version of dplyr (0.5.0).

Version June 10, 2016

  • Fixed a bug wherein findNovelAlleles() was not running in parallel, even when nproc > 1.
  • Changed default to nproc=1 in findNovelAlleles().

Version 0.2.5: June 07, 2016

  • Initial CRAN release.


1.0.0 by Jason Vander Heiden, 2 years ago

Authors: Daniel Gadala-Maria [aut] , Susanna Marquez [aut] , Moriah Cohen [aut] , Jason Vander Heiden [aut, cre] , Gur Yaari [aut] , Steven Kleinstein [aut, cph]

Documentation:   PDF Manual  

AGPL-3 license

Imports alakazam, shazam, dplyr, doParallel, foreach, graphics, gridExtra, gtools, iterators, lazyeval, parallel, rlang, stats, stringi, tidyr

Depends on ggplot2

Suggests knitr, rmarkdown, testthat

Imported by rabhit.

See at CRAN