GMDS 2012: 57. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

16-20 September 2012, Braunschweig

Genome-wide, permutation-based rare variant analysis with INTERSNP-RARE

Meeting Abstract

  • Dmitriy Drichel - Deutsches Zentrum für neurodegenerative Erkrankungen, Bonn, Germany
  • André Lacour - Deutsches Zentrum für neurodegenerative Erkrankungen, Bonn, Germany
  • Christine Herold - Deutsches Zentrum für neurodegenerative Erkrankungen, Bonn, Germany
  • Tatsiana Vaitakhovich - Institut für medizinische Biometrie, Informatik und Epidemiologie, Bonn, Germany
  • Markus Leber - Institut für medizinische Biometrie, Informatik und Epidemiologie, Bonn, Germany
  • Vitalia Schueller - Institut für medizinische Biometrie, Informatik und Epidemiologie, Bonn, Germany
  • Tim Becker - Institut für medizinische Biometrie, Informatik und Epidemiologie, Bonn, Germany

GMDS 2012. 57. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Braunschweig, 16.-20.09.2012. Düsseldorf: German Medical Science GMS Publishing House; 2012. Doc12gmds161

DOI: 10.3205/12gmds161, URN: urn:nbn:de:0183-12gmds1617

Published: September 13, 2012

© 2012 Drichel et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en). You are free: to Share – to copy, distribute and transmit the work, provided the original author and source are credited.


Text

With comprehensive genome-wide data sets becoming more accessible, systematic investigation of disease association with rare variants (minor allele frequency, MAF < 5%) is an increasingly appealing strategy. We present INTERSNP-RARE, a software package for genome-wide rare-variant testing that implements several test procedures: CMAT (cumulative minor allele test [1]), COLL (collapsing test, a version of CMC [2]) and FR (Fisher_rare, based on the Fisher combination test). We also implement the corresponding variable-threshold (VT) extensions of these tests using a permutation-based method. Combined with permutation-based determination of p-values, this approach promises maximized power without overcorrection for multiple testing while accounting for LD structure.
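
The following Python sketch illustrates the general idea of a permutation-based variable-threshold collapsing test. It is not the INTERSNP-RARE implementation: the function names, the 2x2 chi-square statistic, the candidate thresholds and the toy data are all assumptions made for illustration. Carrier status is collapsed over all SNPs in a bin whose MAF lies at or below a threshold, the statistic is maximized over thresholds, and the p-value is obtained by shuffling case/control labels and repeating the maximization, so the search over thresholds is accounted for.

```python
import random

def minor_allele_freq(genotypes, j):
    """MAF of SNP j from 0/1/2 minor-allele counts (one row per individual)."""
    return sum(row[j] for row in genotypes) / (2 * len(genotypes))

def collapsing_chi2(genotypes, labels, snps):
    """1-df chi-square (no continuity correction) of carrier status vs. case status."""
    a = b = c = d = 0  # case carriers, case non-carriers, control carriers, control non-carriers
    for row, y in zip(genotypes, labels):
        carrier = any(row[j] > 0 for j in snps)
        if y == 1:
            a, b = (a + 1, b) if carrier else (a, b + 1)
        else:
            c, d = (c + 1, d) if carrier else (c, d + 1)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def vt_statistic(genotypes, labels, mafs, thresholds):
    """Maximum collapsing statistic over the candidate MAF thresholds."""
    best = 0.0
    for t in thresholds:
        snps = [j for j, f in enumerate(mafs) if 0 < f <= t]
        if snps:
            best = max(best, collapsing_chi2(genotypes, labels, snps))
    return best

def vt_permutation_pvalue(genotypes, labels, thresholds, n_perm=999, seed=0):
    """Permutation p-value of the VT statistic; labels are shuffled, genotypes kept fixed."""
    rng = random.Random(seed)
    # MAFs come from the pooled sample, so they do not change under permutation.
    mafs = [minor_allele_freq(genotypes, j) for j in range(len(genotypes[0]))]
    observed = vt_statistic(genotypes, labels, mafs, thresholds)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if vt_statistic(genotypes, shuffled, mafs, thresholds) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (made up): 100 cases, 100 controls, 3 SNPs; 8 cases carry SNP 0.
genotypes = [[1 if i < 8 else 0, 0, 0] for i in range(200)]
labels = [1] * 100 + [0] * 100
stat, p = vt_permutation_pvalue(genotypes, labels, thresholds=[0.01, 0.05])
print(f"VT statistic = {stat:.2f}, permutation p = {p:.3f}")
```

Because the genotype rows are left intact and only the phenotype labels are permuted, the correlation between SNPs within a bin is preserved in every permutation replicate.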

All rare-variant tests operate on bins, i.e. physically contiguous chromosomal segments. Bins can be created algorithmically, based on bp distance or on the number of (rare) SNPs. In addition, bins can be created from user-supplied data in various formats (*.bed, *.gff, ...), which facilitates binning strategies based on a priori information such as LD block structure, genomic function or conservation status. Various bin-modification functions, e.g. merging, concatenating and flanking, are supported.
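
As a minimal sketch of the two algorithmic binning strategies, the Python functions below group SNPs either by bp distance or by SNP count. The function names and the windowing convention (each window starts at the first SNP not yet assigned to a bin) are assumptions for illustration and do not describe how INTERSNP-RARE builds its bins. Positions are assumed to be sorted and to lie on a single chromosome.

```python
def bins_by_distance(positions, width_bp):
    """Group sorted bp positions into windows spanning less than width_bp.

    Returns a list of bins, each a list of SNP indices.
    """
    bins, current, start = [], [], None
    for i, pos in enumerate(positions):
        # Open a new bin once the SNP lies width_bp or more from the bin start.
        if start is None or pos - start >= width_bp:
            if current:
                bins.append(current)
            current, start = [], pos
        current.append(i)
    if current:
        bins.append(current)
    return bins

def bins_by_count(n_snps, size):
    """Group consecutive SNP indices into bins of at most `size` SNPs."""
    return [list(range(i, min(i + size, n_snps))) for i in range(0, n_snps, size)]

# Made-up positions for illustration.
positions = [100, 200, 1500, 1600, 5000]
print(bins_by_distance(positions, 1000))
print(bins_by_count(len(positions), 2))
```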

Results from our power study on simulated data offer insights into the strengths and shortcomings of the implemented tests under different conditions. With 20 to 60 rare SNPs per bin, each causal, protective or neutral, we find that single-marker analysis outperforms the other approaches in some scenarios, in particular for relatively large MAFs and few causal markers (~10%), while CMAT and COLL have excellent power in models with ~30–50% damaging and up to 20–30% protective variants. FR is well powered even at a low fraction of causal SNPs (from 10% upwards) and remains highly robust as the number of protective markers increases.


References

1. Zawistowski, et al. Am J Hum Genet. 87(5):604-17.
2. Li, et al. Am J Hum Genet. 83:311-21.