How to link call rate, Hardy-Weinberg equilibrium and quality of SNP-data – results from the KORA 500K project
Published: 6 September 2007
Background: Using SNP data from the KORA 500K project, a genome-wide association (GWA) study in a population-based German sample of 1,644 individuals, we study how the call rate (CR) relates to Hardy-Weinberg equilibrium (HWE) and to the quality of the data.
Materials and methods: KORA (Cooperative Health Research in the Region of Augsburg, Germany) is a population-based research platform. In total, four surveys have been conducted. KORA S3 consists of a representative sample of 4,856 subjects. In 2003/04, 2,974 participants returned for follow-up (KORA F3). For 1,644 of these subjects, a genome-wide analysis was performed using Affymetrix GeneChip 500K arrays containing approximately 500,000 SNPs. Genotypes were called with the software BRLMM, version 1.4.0. We applied statistical tools based on (realized randomized) p-values and the false discovery rate (FDR).
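The quantities named in the methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes per-SNP genotype counts are available, replaces the realized randomized p-values with an ordinary asymptotic 1-df chi-square test of HWE, and uses the Benjamini-Hochberg procedure as one common way to control the FDR. All function names are hypothetical.

```python
import math


def call_rate(n_aa, n_ab, n_bb, n_samples):
    """Fraction of samples for which a genotype was called."""
    return (n_aa + n_ab + n_bb) / n_samples


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Asymptotic 1-df chi-square test of HWE from genotype counts.

    Assumption: a plain chi-square test, not the realized randomized
    p-values mentioned in the abstract.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no evidence against HWE
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(stat / 2.0))


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return booleans marking p-values rejected at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k_max = rank  # largest rank that satisfies the BH bound
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected
```

For example, `call_rate(90, 180, 90, 400)` gives the CR of a SNP with 40 missing calls among 400 samples, and `hwe_chisq_pvalue(90, 180, 90)` tests the called genotypes of that SNP for HWE.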
Results: We show that the SNPs with complete genotype information (CR=100%) are in nearly perfect HWE, which supports the assumption that the population itself is in HWE. In contrast, the proportion of SNPs not in HWE increases linearly with decreasing CR. Hence, it can be argued that failure of HWE is not due to the investigated KORA population but solely to genotyping errors.
Conclusion: This finding has important implications for the analysis of GWA studies. The use of a single threshold for HWE p-values as a quality criterion cannot be recommended. Instead, a stratified analysis with different thresholds depending on the CR needs to be carried out.
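A stratified analysis of this kind could be sketched in Python as follows. The abstract gives neither the stratum boundaries nor the thresholds, so the CR edges and the per-stratum HWE thresholds below are placeholders chosen only for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical CR stratum boundaries; the last stratum is CR = 100%.
EDGES = (0.90, 0.95, 0.98, 1.00)


def cr_stratum(cr, edges=EDGES):
    """Return the index of the call-rate stratum that contains cr."""
    for idx, edge in enumerate(edges):
        if cr < edge:
            return idx
    return len(edges)  # cr equals the top edge (complete genotype data)


def hwe_failure_by_stratum(snps, threshold=1e-4, edges=EDGES):
    """Fraction of SNPs with HWE p-value below threshold, per CR stratum.

    snps is an iterable of (call_rate, hwe_pvalue) pairs.
    """
    total = defaultdict(int)
    failed = defaultdict(int)
    for cr, hwe_p in snps:
        s = cr_stratum(cr, edges)
        total[s] += 1
        if hwe_p < threshold:
            failed[s] += 1
    return {s: failed[s] / total[s] for s in total}


def passes_stratified_qc(cr, hwe_p, thresholds, edges=EDGES):
    """Keep a SNP if its HWE p-value reaches the threshold of its stratum."""
    return hwe_p >= thresholds[cr_stratum(cr, edges)]


# Placeholder thresholds per stratum index; not a recommendation.
EXAMPLE_THRESHOLDS = {0: 1e-2, 1: 1e-3, 2: 1e-4, 3: 1e-5, 4: 1e-6}
```

`hwe_failure_by_stratum` reproduces the kind of tabulation behind the results (proportion of SNPs out of HWE per CR stratum), while `passes_stratified_qc` applies a CR-dependent threshold in place of a single global cut-off.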