gms | German Medical Science

49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds)
19. Jahrestagung der Schweizerischen Gesellschaft für Medizinische Informatik (SGMI)
Jahrestagung 2004 des Arbeitskreises Medizinische Informatik (ÖAKMI)

26. bis 30.09.2004, Innsbruck/Tirol

Multivariable regression models with continuous covariates

Meeting Abstract (gmds2004)

Kooperative Versorgung - Vernetzte Forschung - Ubiquitäre Information. 49. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds), 19. Jahrestagung der Schweizerischen Gesellschaft für Medizinische Informatik (SGMI) und Jahrestagung 2004 des Arbeitskreises Medizinische Informatik (ÖAKMI) der Österreichischen Computer Gesellschaft (OCG) und der Österreichischen Gesellschaft für Biomedizinische Technik (ÖGBMT). Innsbruck, 26.-30.09.2004. Düsseldorf, Köln: German Medical Science; 2004. Doc04gmds005

The electronic version of this article is the complete one and can be found online at:

Published: September 14, 2004

© 2004 Royston.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License. You are free to share (to copy, distribute and transmit) the work, provided the original author and source are credited.



Regression models play a central role in epidemiology and clinical studies. In epidemiology the emphasis is typically either on determining whether a given risk factor affects the outcome of interest, or on estimating a dose-response curve for a given factor, in both cases adjusting for confounders. An important class of clinical studies is the prognostic factor study, in which the outcome for patients with chronic diseases such as cancer is predicted from various clinical features. In both application areas, it is almost always necessary to build a multivariable model that incorporates known or suspected influential variables while eliminating those found to be unimportant.

Risk or prognostic factors are commonly measured on a continuous scale, an obvious example being a person's age. Conventionally, such factors are either modelled as linear functions or converted into categories according to some chosen set of cut-points. However, categorisation, and the use of the estimates it produces, is known to be fraught with difficulty; see for example Altman et al [1]. A linear function, for its part, may fit the data badly and give misleading estimates of risk. Reliable approaches for representing the effects of continuous factors in multivariable models are therefore urgently needed.

The main concern of this lecture is the building of multivariable regression models for data from clinical and epidemiological studies, which involves both selecting influential covariates and determining the functional form of the relationship between each continuous covariate and the outcome. Systematic procedures that combine selection of influential variables with determination of functional form for continuous factors are rare. Analysts may apply their individual subjective preferences to each part of the model-building process, estimate parameters for several models and then decide on the final strategy according to the results they find. By contrast, we will present the multivariable fractional polynomial (MFP) approach as a systematic way to determine a multivariable regression model. Major concerns will be discussed, including robustness and possible model instability. Regarding determination of the functional form, we will also discuss some alternatives that place more emphasis on local estimation of the function (e.g. splines). The MFP procedure may be used for various types of regression model (linear regression, logistic regression, Cox model, etc). A clinical and an epidemiological example will be used to illustrate and compare the approaches. Software for R, Stata and SAS is generally available.
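As a rough illustration of the idea behind fractional polynomials, the sketch below searches the conventional power set of Royston and Altman (1994) for the single power transformation of one positive covariate that gives the best least-squares fit. It is not the MFP procedure itself: it handles only one covariate and a first-degree function, and it omits the significance testing and variable selection that MFP adds. The function names (`fp_transform`, `fit_simple_ols`, `best_fp1`) and the toy data are hypothetical, chosen only for this example.

```python
import math

# Conventional fractional polynomial power set; power 0 denotes log(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Return x**p, with the convention that p == 0 means log(x); needs x > 0."""
    if x <= 0:
        raise ValueError("fractional polynomials require a positive covariate")
    return math.log(x) if p == 0 else x ** p


def fit_simple_ols(z, y):
    """Least-squares fit of y = b0 + b1*z; returns (b0, b1, residual sum of squares)."""
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    szz = sum((zi - z_mean) ** 2 for zi in z)
    szy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    b1 = szy / szz
    b0 = y_mean - b1 * z_mean
    rss = sum((yi - b0 - b1 * zi) ** 2 for zi, yi in zip(z, y))
    return b0, b1, rss


def best_fp1(x, y, powers=FP_POWERS):
    """Fit one model per candidate power; return (power, b0, b1, rss) with the smallest RSS."""
    fits = []
    for p in powers:
        z = [fp_transform(xi, p) for xi in x]
        b0, b1, rss = fit_simple_ols(z, y)
        fits.append((p, b0, b1, rss))
    return min(fits, key=lambda fit: fit[3])


# Hypothetical noise-free data generated from y = 2 + 3*log(x).
x = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0]
y = [2.0 + 3.0 * math.log(xi) for xi in x]
power, b0, b1, rss = best_fp1(x, y)
```

On this toy data the search selects power 0, the log transformation used to generate the data. With real data, the candidate fits would normally be compared by deviance and tested against the linear and null models rather than ranked by residual sum of squares alone.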


1. Altman DG, Lausen B, Sauerbrei W, Schumacher M. 1994. The dangers of using 'optimal' cutpoints in the evaluation of prognostic factors. Journal of the National Cancer Institute 86: 829-835.
2. Royston P, Altman DG. 1994. Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling (with Discussion). Applied Statistics 43: 429-467.
3. Royston P, Ambler G, Sauerbrei W. 1999. The use of fractional polynomials to model continuous risk variables in epidemiology. International Journal of Epidemiology 28: 964-974.
4. Royston P, Sauerbrei W. 2003. Stability of multivariable fractional polynomial models with selection of variables and transformations: a bootstrap investigation. Statistics in Medicine 22: 639-659.
5. Royston P, Sauerbrei W. 2004. Improving the robustness of fractional polynomial models by preliminary covariate transformation. Statistical Modelling, submitted.
6. Sauerbrei W, Royston P. 1999. Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. Journal of the Royal Statistical Society (A) 162: 71-94. Corrigendum: JRSS (A) 165: 399-400 (2002).