gms | German Medical Science

Gesundheit – gemeinsam. Joint conference of the Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutsche Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutsche Gesellschaft für Epidemiologie (DGEpi), Deutsche Gesellschaft für Medizinische Soziologie (DGMS), and Deutsche Gesellschaft für Public Health (DGPH)

08.09. - 13.09.2024, Dresden

Comparison of classic polygenic scores with machine learning algorithms to predict hypertension

Meeting Abstract

  • Tanja K. Rausch - Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Lübeck, Germany
  • Silke Szymczak - Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Lübeck, Germany
  • Inke R. König - Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Lübeck, Germany

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH). Dresden, 08.-13.09.2024. Düsseldorf: German Medical Science GMS Publishing House; 2024. DocAbstr. 954

doi: 10.3205/24gmds109, urn:nbn:de:0183-24gmds1097

Published: September 6, 2024

© 2024 Rausch et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Hypertension is the leading risk factor for the development of cardiovascular disease, and because blood pressure is a frequently measured clinical parameter, data on it are widely available. Based on the polygenic heritability shown for complex traits like hypertension, polygenic scores (PGS) are increasingly being used in preclinical and clinical research to stratify individuals according to their genetic susceptibility for targeted prevention, therapy, or prognosis. However, classic PGS use a simple sum of individual genotypes, weighted by the association estimated from single-variant genome-wide association studies (GWAS). Thus, multivariable and non-linear effects are not taken into account. Since classical statistical methods reach their limits when a large number of independent variables is included, machine learning (ML) algorithms can be used as an alternative for score construction.
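The weighted sum behind a classic PGS can be illustrated with a minimal sketch. It assumes genotypes are coded as effect-allele counts (0, 1, or 2) and that the weights are per-variant effect estimates from a GWAS; the function name and the example values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def polygenic_score(genotypes: Sequence[int], weights: Sequence[float]) -> float:
    """Classic PGS: sum of effect-allele counts weighted by per-variant GWAS estimates.

    Assumption: genotypes are coded 0/1/2 and are in the same variant order as weights.
    """
    if len(genotypes) != len(weights):
        raise ValueError("genotypes and weights must cover the same variants")
    return sum(g * w for g, w in zip(genotypes, weights))


# Hypothetical values for three variants and one individual.
example_weights = [0.10, -0.05, 0.20]
example_genotypes = [2, 1, 0]
example_score = polygenic_score(example_genotypes, example_weights)  # approximately 0.15
```

Because every variant contributes independently and linearly, a score of this form cannot represent interactions between variants or non-linear effects, which is the limitation the ML algorithms are meant to address.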

Machine learning algorithms have not yet been applied to construct polygenic scores to predict hypertension. Therefore, it is unclear whether more complex algorithms are better able to predict hypertension than classic scores. This study aims to evaluate different ML algorithms suitable for classification problems, such as random forest, LASSO, elastic net, and support vector classifier. For the benchmarking, data from the UK Biobank will be used, a biomedical database containing genetic and health information from half a million participants from the United Kingdom. Hypertension will be defined as taking blood pressure lowering medication, a diastolic blood pressure above 90 mmHg, or a systolic blood pressure above 140 mmHg at the initial assessment visit. The data set will be repeatedly and randomly split into training and test data sets. The training data set will be used to generate a simple weighted PGS for hypertension by performing a GWAS and to train the ML models. Hyperparameter tuning will be performed, as well as variable selection where applicable. Prediction performance of the resulting models will be compared on the independent test data set using the area under the receiver operating characteristic curve (AUC).
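The phenotype definition, the random splitting, and the AUC comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: all function names, the split fraction, and the toy values are hypothetical, the GWAS and model fitting steps are omitted, and the AUC is computed as the proportion of case-control pairs in which the case receives the higher score (ties count as one half).

```python
import random
from typing import List, Sequence, Tuple


def is_hypertensive(systolic: float, diastolic: float, on_bp_medication: bool) -> bool:
    """Phenotype as stated in the abstract: medication, SBP > 140 mmHg, or DBP > 90 mmHg."""
    return on_bp_medication or systolic > 140 or diastolic > 90


def random_split(n: int, test_fraction: float, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) for one random split of n individuals.

    Assumption: test_fraction is illustrative; the abstract does not state a split ratio.
    """
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(
        1.0 if case > control else 0.5 if case == control else 0.0
        for case in cases
        for control in controls
    )
    return wins / (len(cases) * len(controls))


# Toy values only: two controls, two cases, and their predicted scores.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75

# One of the repeated splits; calling this in a loop yields repeated random splits.
train_idx, test_idx = random_split(10, 0.3, random.Random(0))
```

In a benchmark of this kind, each model (classic PGS or ML) would be fitted on the training indices only, and `auc` would be evaluated on the test indices of the same split so that all models are compared on identical held-out individuals. The pairwise AUC shown here is quadratic in sample size; for data the size of the UK Biobank, a rank-based implementation from an established library would be the practical choice.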

Results will be presented at the conference. The study results will provide better insight into whether compressed genetic information obtained by complex machine learning algorithms performs better than classic PGS in predicting hypertension, and which model performs best for classification. Additionally, the results will allow conclusions to be drawn about the genetic structure of hypertension.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.