gms | German Medical Science

49th Annual Meeting of the Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds)
19th Annual Meeting of the Schweizerische Gesellschaft für Medizinische Informatik (SGMI)
2004 Annual Meeting of the Arbeitskreis Medizinische Informatik (ÖAKMI)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie
Schweizerische Gesellschaft für Medizinische Informatik (SGMI)

26 to 30 September 2004, Innsbruck/Tyrol

Semiparametric Modelling of Particulate Matter Time Series: Concepts, Pitfalls and Practical Consequences

Meeting Abstract (gmds2004)

  • Michael G. Schimek (corresponding author, presenter) - Medizinische Universität Graz, Graz, Austria
  • Manfred Neuberger - Medizinische Universität Wien, Vienna, Austria

Kooperative Versorgung - Vernetzte Forschung - Ubiquitäre Information [Cooperative Care - Networked Research - Ubiquitous Information]. 49th Annual Meeting of the Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds), 19th Annual Meeting of the Schweizerische Gesellschaft für Medizinische Informatik (SGMI) and 2004 Annual Meeting of the Arbeitskreis Medizinische Informatik (ÖAKMI) of the Österreichische Computer Gesellschaft (OCG) and the Österreichische Gesellschaft für Biomedizinische Technik (ÖGBMT). Innsbruck, 26-30 September 2004. Düsseldorf, Köln: German Medical Science; 2004. Doc04gmds161

The electronic version of this article is the complete one and is available at:

Published: 14 September 2004

© 2004 Schimek et al.
This is an Open Access article distributed under the terms of the Creative Commons license. It may be reproduced, distributed and made publicly available, provided that the author and source are credited.




Particulate matter (PM10 and below) is a component of air pollution with a strong detrimental effect on human health. The APHEA research program in Europe [5] and the NMMAPS research program in the US [2] aim to quantify this relationship. Generalized Additive Models (GAMs), as implemented in S-PLUS, have become the de facto standard tool for statistical analysis in this area. Unfortunately, in the spring of 2002 it became known that the iterative backfitting algorithm for GAMs requires stringent convergence criteria to give reliable results [1], [6]. The default criteria in the S-PLUS package were blamed for this problem. This led to heated discussion and some publicity. In the US, many epidemiological studies have since been re-evaluated, and in some instances corrected coefficient estimates for the relationship between PM10 and mortality or morbidity have been reported. In this presentation we discuss, on more general grounds, the problems of using the classical GAM concept in a semiparametric setting, i.e. estimating the PM predictor effects parametrically while controlling for the other environmental and climate variables (covariates) nonparametrically.
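The role of the convergence criterion can be illustrated with a minimal Python sketch. It is not the S-PLUS implementation: it assumes a Gaussian response instead of the count models usual in PM studies, a simple kernel smoother, simulated data, and tolerance values chosen only for illustration. All function and variable names are hypothetical.

```python
import numpy as np


def gaussian_smoother_matrix(t, bandwidth):
    """Row-normalised Gaussian kernel (Nadaraya-Watson) smoother matrix."""
    d = (t[:, None] - t[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)
    return w / w.sum(axis=1, keepdims=True)


def backfit(y, pm, smoother, tol, max_iter):
    """Backfit the toy model y = alpha + beta * pm + f(t) + error.

    Stops when the relative change in beta between two sweeps falls
    below tol, or after max_iter sweeps. Returns (beta, sweeps_used).
    """
    alpha = y.mean()
    pm_c = pm - pm.mean()          # centred PM series
    beta = 0.0
    f = np.zeros_like(y)
    for sweep in range(1, max_iter + 1):
        # Parametric step: regress the partial residual on PM.
        beta_new = pm_c @ (y - alpha - f) / (pm_c @ pm_c)
        # Nonparametric step: smooth the partial residual over time.
        f = smoother @ (y - alpha - beta_new * pm_c)
        f -= f.mean()              # keep f centred for identifiability
        change = abs(beta_new - beta) / max(abs(beta_new), 1e-12)
        beta = beta_new
        if change < tol:
            break
    return beta, sweep


# Simulated data (an assumption for illustration): PM and the outcome
# both carry a seasonal pattern, so PM is partly predictable from time.
rng = np.random.default_rng(0)
n = 730
t = np.arange(n, dtype=float)
pm = 30.0 + 10.0 * np.sin(2 * np.pi * t / 365.0) + rng.normal(0.0, 3.0, n)
y = (50.0 + 0.5 * pm + 8.0 * np.cos(2 * np.pi * t / 365.0)
     + rng.normal(0.0, 2.0, n))

S = gaussian_smoother_matrix(t, bandwidth=20.0)

# Loose versus stringent stopping rules (illustrative values).
beta_loose, sweeps_loose = backfit(y, pm, S, tol=1e-3, max_iter=10)
beta_strict, sweeps_strict = backfit(y, pm, S, tol=1e-10, max_iter=1000)

print("loose :", beta_loose, "after", sweeps_loose, "sweeps")
print("strict:", beta_strict, "after", sweeps_strict, "sweeps")
```

In this toy setting the PM series and the smooth time trend share a seasonal component, so the parametric and the nonparametric part compete for the same signal. The change between two sweeps can then be small while beta is still some distance from its limit, which is why a loose tolerance or a low sweep limit may stop the iteration too early.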


The classical GAMs due to [4], a fully nonparametric concept, are introduced, and the computational machinery behind them, the backfitting algorithm and its variants, is characterised. We point out that the original idea behind GAMs does not meet the demands of epidemiological PM modelling. Next, the semiparametric extension of GAMs is motivated, and we explain why, from a purely applied point of view, this concept has become so popular in the PM community. The pitfalls (see above) can only be understood against the background of statistical methodology and numerical mathematics [7]. We argue that poor convergence is merely an indicator of underlying problems: an inappropriate modelling strategy, data that are not compatible with the model (note that we are confronted with time-dependent observations), or a mixture of both.
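The distinction between the two model classes can be written down in generic notation. The symbols are assumptions made for illustration and are not taken from the abstract: $Y_t$ is the daily count of health events, $g$ a link function (the log link is a common choice for counts), $\mathrm{PM}_t$ the particulate matter reading, and $x_{tj}$ the remaining covariates such as time trend and weather.

```latex
% Classical, fully nonparametric GAM: every predictor,
% including PM (here x_{t0}), enters through a smooth function.
\begin{align}
  g\bigl(\mathrm{E}[Y_t]\bigr) &= \alpha + \sum_{j=0}^{p} f_j(x_{tj})
\end{align}

% Semiparametric GAM: PM enters linearly with coefficient \beta,
% while the other covariates are still controlled for by smooth functions.
\begin{align}
  g\bigl(\mathrm{E}[Y_t]\bigr) &= \alpha + \beta\,\mathrm{PM}_t + \sum_{j=1}^{p} f_j(x_{tj})
\end{align}
```

In the semiparametric form the quantity of epidemiological interest is the single coefficient $\beta$, so any inaccuracy in the fitted smooth terms $f_j$ carries over directly into the reported PM effect.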

It is shown how results from a classical GAM can be improved, even in the semiparametric context. Moreover, alternative GAM concepts (e.g. [3]) are introduced and compared. In this discussion, both methodological and computational aspects are of interest. Finally, the consequences for the use of S-PLUS are outlined.


The different semiparametric GAM approaches are illustrated using PM readings, other covariate measurements and relevant hospital admission data from Linz, Upper Austria (AUPHEP). The results obtained with the different approaches are often, but not always, the same.


We draw conclusions from the methodological and computational features that characterise the different semiparametric GAM concepts. We also give practical advice for epidemiological data analysis, in terms of which software to use and how to use it, and in terms of quality control.


The data set used for illustration comes from AUPHEP, a project funded by the Austrian Ministry for the Environment, Youth and Family Affairs, the Austrian Ministry for Science and Traffic, and the Austrian Academy of Sciences.


[1] Dominici F, et al. (2002) On the Use of Generalized Additive Models in Time-Series Studies of Air Pollution and Health. American Journal of Epidemiology 156: 193-203.
[2] Dominici F, Samet JM, Zeger SL (2001) Combining Evidence on Air Pollution and Daily Mortality from the 20 Largest US Cities: a Hierarchical Modelling Strategy (with discussion). Journal of the Royal Statistical Society, A 163: 263-302.
[3] Eilers PHC, Schimek MG (2003) Generalized Additive Models in Particulate Matter Studies: Statistical and Computational Perspectives. Bulletin de l'Institut International de Statistique, 54ème Session, Livraison 1, 332-335.
[4] Hastie T, Tibshirani R (1990) Generalized Additive Models. London: Chapman and Hall.
[5] Katsouyanni K, et al. (2001) Confounding and Effect Modification in the Short-Term Effects of Ambient Particles on Total Mortality: Results from 29 European Cities within the APHEA2 Project. Epidemiology 12: 521-531.
[6] Katsouyanni K, et al. (2002) Different Convergence Parameters Applied to the S-PLUS GAM Function. Epidemiology 13: 742.
[7] Schimek MG, Turlach B (2000) Additive and Generalized Additive Models. In: Schimek MG (ed.) Smoothing and Regression. Approaches, Computation and Application. New York: John Wiley, 277-327.