gms | German Medical Science

54. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

7 to 10 September 2009, Essen

Penalized likelihood approaches for high-dimensional model selection

Meeting Abstract

  • Axel Benner - DKFZ, Heidelberg

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 54. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds). Essen, 07.-10.09.2009. Düsseldorf: German Medical Science GMS Publishing House; 2009. Doc09gmds110

DOI: 10.3205/09gmds110, URN: urn:nbn:de:0183-09gmds1104

Published: September 2, 2009

© 2009 Benner.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en). You are free: to Share – to copy, distribute and transmit the work, provided the original author and source are credited.


Text

One important topic of current research on observational studies, and especially on prognostic factor studies, is the development of methods for analysing high-dimensional data, in which the number of explanatory variables is much larger than the number of observations. This is driven mainly by the requirements of biomedical applications such as DNA microarrays. The major problem in analysing such data is the danger of overfitting.

Methodological challenges arise when large sets of covariates, e.g. patients' gene expression profiles, are used to predict survival endpoints, because of the large number of variables and their complex interdependence.

The aim of this talk is to show how penalized regression models can be employed to analyse high-dimensional data. These include linear, logistic and proportional hazards regression models.
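As a rough illustration of the linear case, the following minimal sketch fits a regression with an L1 (lasso) penalty by cyclic coordinate descent on simulated data with more variables than observations. The abstract does not say which penalties the talk covers, so the lasso, the function names, the simulated data and the value of the penalty parameter lam are all assumptions made for illustration only.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrink z toward zero by gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p entries each; y is a list of n responses.
    No intercept is fitted (an illustrative simplification).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, because beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Scaled inner product of column j with the partial residual
            # (the residual with the contribution of beta[j] added back).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                beta[j] = new_bj
    return beta


# Hypothetical simulated data: 20 observations, 100 variables, 2 true signals.
random.seed(0)
n, p = 20, 100
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_beta = [0.0] * p
true_beta[0], true_beta[1] = 3.0, -2.0
y = [sum(X[i][j] * true_beta[j] for j in range(p)) + random.gauss(0, 0.1)
     for i in range(n)]

beta_hat = lasso_coordinate_descent(X, y, lam=0.5)  # lam chosen arbitrarily
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print("number of selected variables:", len(selected))
```

In practice the penalty parameter would be chosen by a data-driven criterion such as cross-validated prediction error rather than fixed in advance. The same penalty can be added to a logistic or partial log-likelihood in place of the squared-error term.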

We illustrate the different approaches using real data examples from clinical microarray studies, including gene expression data. The results are discussed with respect to prediction error and interpretability.