gms | German Medical Science

50. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds)
12. Jahrestagung der Deutschen Arbeitsgemeinschaft für Epidemiologie (dae)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie
Deutsche Arbeitsgemeinschaft für Epidemiologie

12 to 15 September 2005, Freiburg im Breisgau

Fieller versus Efron - a comparison of two asymptotic approaches to ICER interval estimation in the presence of negligible correlation between cost and efficacy data

Meeting Abstract

  • Christine Seither - Technische Universität Dresden, Dresden
  • Frank Krummenauer - Technische Universität Dresden, Dresden

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. Deutsche Arbeitsgemeinschaft für Epidemiologie. 50. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds), 12. Jahrestagung der Deutschen Arbeitsgemeinschaft für Epidemiologie. Freiburg im Breisgau, 12.-15.09.2005. Düsseldorf, Köln: German Medical Science; 2005. Doc05gmds012

The electronic version of this article is the complete one and is available at:

Published: 8 September 2005

© 2005 Seither et al.
This article is an Open Access article distributed under the Creative Commons license terms. It may be reproduced, distributed and made publicly available, provided that the author and source are credited.



Introduction and Purpose

Because financial resources in health care systems are limited, therapeutic strategies are now evaluated not only from a clinical but also from a health economic point of view, linking their clinical efficacy to the underlying costs. In this context, the estimation of incremental cost effectiveness ratios (ICERs) has gained increasing attention during this decade. ICERs relate the cost difference between therapeutic alternatives to the corresponding difference in clinical efficacy. Despite this intuitive interpretation of ICERs as “additional costs per additional benefit unit”, their statistical treatment poses severe problems, because the distribution of a ratio of stochastically dependent random variables has to be estimated: if two independent treatment groups are contrasted with respect to their relative cost effectiveness, interval estimation of the ICER between these groups requires the simultaneous treatment of four random variables (one cost and one efficacy distribution in each of the two samples), where cost and efficacy are often highly correlated. Accordingly, standard density transformation techniques, which would yield the ratio’s overall density function as a basis for moment estimation, are no longer appropriate.

One approach to estimating ratio intervals in this setting is based on modifications of the Fieller theorem [1]. However, the validity of the resulting interval estimates depends crucially on normal approximations, which must be questioned for real patient cost data: such data are usually skewed and therefore clearly violate the normality assumption, so that the normal approximation imposes severe requirements on sample sizes.

A different approach [2] suggests the use of Efron’s Bootstrap, which seems quite promising in the present setting: the multivariate Bootstrap makes it possible to imitate the multivariate correlation structure [3] of the underlying data distributions and can therefore be expected to provide less biased ratio interval estimates. On the other hand, it is often ignored that the Bootstrap approach is itself an asymptotic procedure, the validity of which is limited not only by the number of Bootstrap iterations, but also by the size of the underlying samples [3] from which the Bootstrap simulation is generated. Multivariate Berry/Esséen-type bounds for the multivariate (!) Bootstrap indicate that the required sample sizes are much larger than the standard sizes established to ensure validity of univariate normal approximations. In practice, it must be questioned whether health economic data sets will meet these sample size requirements.

In summary, both strategies for ICER interval estimation crucially depend on the underlying data set’s sample size. Therefore this paper seeks to investigate the order of bias inherent in these approaches; the investigation is based on real patient data from maxillofacial surgery [4] and simulation studies based on this data set [5].

Material and Methods

Model parametrization

The following considers two therapeutic alternatives 1 and 2, where treatment 1 denotes an established standard procedure and treatment 2 is under discussion for possible recommendation for funding by health care insurers. Let the random variables K1 and K2 denote the treatments’ costs and the random variables E1 and E2 the treatments’ respective efficacy indicators; the following assumes K2 > K1 and E2 > E1 (such a treatment alternative 2 is usually called “admissible” for resource allocation). The ratio K / E is referred to as the cost effectiveness ratio (CER) and describes a treatment’s marginal costs per gained clinical benefit unit. The incremental cost effectiveness ratio (ICER) of a treatment 2 versus the standard treatment 1 is defined as ICER21 = (K2 – K1) / (E2 – E1) and describes the additional costs that must be invested to achieve one additional clinical benefit unit under treatment 2 instead of the standard.

Fieller and Bootstrap estimation

In the setting of two independent treatment samples, the ICER as defined here can be estimated by plugging in the samples’ respective mean estimates for K and E. The Fieller method for (one-sided) interval estimation of the ICER then relies on the asymptotic normality of the difference (K2 – K1) – U (E2 – E1) with the sample means plugged in, where U denotes the ICER’s upper confidence bound at an appropriate confidence level. The standardised difference can be considered asymptotically normal, at least for sufficiently large sample sizes in the underlying evaluation study. Equating this standardised difference to the appropriate normal distribution quantile results in a quadratic equation for U, which can be solved numerically to derive a data-based asymptotic interval estimate [5].
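The Fieller construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fieller_upper_bound` is hypothetical, the variances and the covariance of the mean differences are estimated from the two independent samples in the usual way, and the quadratic equation is solved in closed form rather than numerically.

```python
import math
import numpy as np

def fieller_upper_bound(cost1, eff1, cost2, eff2, z=1.645):
    """Upper one-sided Fieller bound U for the ICER of treatment 2 versus 1.

    z = 1.645 is the one-sided 95% standard normal quantile.
    Returns None if the quadratic has no finite upper root.
    (Hypothetical helper for illustration only.)
    """
    cost1, eff1, cost2, eff2 = (
        np.asarray(a, dtype=float) for a in (cost1, eff1, cost2, eff2)
    )
    n1, n2 = len(cost1), len(cost2)

    # Differences of the sample means: (K2 - K1) and (E2 - E1)
    d_k = cost2.mean() - cost1.mean()
    d_e = eff2.mean() - eff1.mean()

    # Variances and covariance of these differences (independent groups)
    v_k = cost1.var(ddof=1) / n1 + cost2.var(ddof=1) / n2
    v_e = eff1.var(ddof=1) / n1 + eff2.var(ddof=1) / n2
    c_ke = np.cov(cost1, eff1)[0, 1] / n1 + np.cov(cost2, eff2)[0, 1] / n2

    # (d_k - U d_e)^2 = z^2 (v_k - 2 U c_ke + U^2 v_e)
    # rearranged to  a U^2 - 2 b U + c = 0
    a = d_e**2 - z**2 * v_e
    b = d_k * d_e - z**2 * c_ke
    c = d_k**2 - z**2 * v_k
    disc = b**2 - a * c
    if a <= 0 or disc < 0:
        return None  # efficacy difference too imprecise for a finite bound
    return (b + math.sqrt(disc)) / a  # larger root = upper bound
```

The guard `a <= 0` covers the case in which the efficacy difference is not clearly different from zero, where the Fieller set is unbounded.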

The Bootstrap approach draws bivariate (!) replicates from the original bivariate cost and efficacy data, separately for each treatment sample. A Bootstrap point estimate for the ICER is obtained by plugging in the Bootstrap estimates of the samples’ cost and efficacy means; the empirical distribution of the Bootstrap ICER is then simulated by repeating this estimation process (in the following, 10,000 Bootstrap replicates were used).
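A bivariate Bootstrap of this kind might look as follows in Python. The sketch assumes a simple percentile interval, which the text does not specify, and the function name `bootstrap_icer_upper_bound` as well as the `seed` parameter are introduced here purely for illustration.

```python
import numpy as np

def bootstrap_icer_upper_bound(cost1, eff1, cost2, eff2,
                               n_boot=10_000, level=0.95, seed=0):
    """Percentile Bootstrap upper bound for the ICER of treatment 2 versus 1.

    Hypothetical helper; the percentile method is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    cost1, eff1, cost2, eff2 = (
        np.asarray(a, dtype=float) for a in (cost1, eff1, cost2, eff2)
    )
    n1, n2 = len(cost1), len(cost2)

    icers = np.empty(n_boot)
    for i in range(n_boot):
        # Resample patients, not single values: each patient's cost and
        # efficacy stay paired, which preserves their correlation.
        idx1 = rng.integers(0, n1, size=n1)
        idx2 = rng.integers(0, n2, size=n2)
        d_k = cost2[idx2].mean() - cost1[idx1].mean()
        d_e = eff2[idx2].mean() - eff1[idx1].mean()
        # Note: replicates with d_e close to zero make the ratio unstable.
        icers[i] = d_k / d_e

    # Upper one-sided bound = empirical quantile of the Bootstrap ICERs
    return np.quantile(icers, level)
```

Drawing one index vector per group and applying it to both the cost and the efficacy array is what makes the replicate bivariate; resampling the two arrays independently would destroy the within-patient correlation.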

Maxillofacial Surgery Data

The above estimation procedures are contrasted using the cost effectiveness evaluation of surgical versus conservative treatment of collum fractures [4]. The surgical intervention consists of implanting a small metal plate to attach and stabilize the fractured parts of the collum for a period of nine months; the non-invasive procedure is based on jointly fixing the patient’s maxilla and mandible by means of metal wires over three months. The latter approach does not require surgery and is therefore less cost-intensive. However, the one-year re-treatment rate of this procedure must be expected to be higher because of the less direct fixation of the fractured area. Efficacy of both procedures was measured in terms of quality adjusted life years (QALYs) by means of questionnaire-based interviews 36 months after the end of the initial treatment. Direct treatment costs were estimated from the hospital documentation; costs for re-treatment within the 36 months under observation were added. Data on 67 collum fractures (35 patients underwent surgical treatment, 32 patients conservative treatment) were analyzed at the Clinic for Dentomaxillofacial Surgery at the University Hospital of Mainz.


Results

Mean costs of 4855 € (standard deviation 312 €) versus 1970 € (290 €) were incurred for surgical and conservative treatment, respectively, with corresponding mean QALY gains of 27.5 QALYs (6.5) versus 21.9 QALYs (7.2). The surgical treatment therefore implied incremental costs of 515 € per additional QALY when contrasted with the non-invasive procedure.
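The ICER point estimate follows directly from the reported group means. The short Python check below only restates this arithmetic; the variable names are chosen here for illustration.

```python
# Group means as reported in the text (costs in EUR, efficacy in QALYs)
mean_cost_surgical, mean_cost_conservative = 4855.0, 1970.0
mean_qaly_surgical, mean_qaly_conservative = 27.5, 21.9

delta_cost = mean_cost_surgical - mean_cost_conservative  # 2885 EUR
delta_qaly = mean_qaly_surgical - mean_qaly_conservative  # 5.6 QALYs
icer = delta_cost / delta_qaly

print(f"ICER = {icer:.0f} EUR per additional QALY")
```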

The Fieller method yielded an upper bound of 564 € per QALY for the one-sided 95% confidence interval of this ICER. The Bootstrap estimate based on 10,000 simulation replicates resulted in an upper confidence bound of 538 € per QALY.

This difference in interval estimates motivated the simulation of patient data according to the above mean cost and efficacy estimates. The cost data were modeled by lognormal distributions (note that the original cost data showed a skewness of 0.8 and 1.3 in the surgical and the non-invasive treatment sample, respectively), the efficacy data by normal distributions. To imitate the empirical correlation between costs and efficacy as observed in the original data (Pearson correlation estimates 0.55 and 0.39, respectively), the bivariate data were first generated from an appropriate binormal distribution; the cost component was then transformed into its skewed lognormal analogue. The simulation study was performed using SAS®; a total of 5,000 replications was implemented. Details on the simulation parameters are described in [5].
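The data-generating step can be sketched in Python as follows. This is not the SAS® program used in the study, whose parameters are given in [5]; it is a minimal sketch that assumes the lognormal parameters are chosen by matching the reported mean and standard deviation, and that the correlation is imposed between log-cost and efficacy. The function name `simulate_group` is hypothetical.

```python
import numpy as np

def simulate_group(n, mean_cost, sd_cost, mean_eff, sd_eff, rho, rng):
    """Draw n (cost, efficacy) pairs: lognormal cost, normal efficacy.

    Hypothetical helper. rho is the correlation between log-cost and
    efficacy; on the original cost scale it is slightly attenuated.
    """
    # Lognormal parameters matching mean and sd on the original scale
    sigma2 = np.log(1.0 + (sd_cost / mean_cost) ** 2)
    mu = np.log(mean_cost) - sigma2 / 2.0
    sigma = np.sqrt(sigma2)

    # Step 1: bivariate normal for (log-cost, efficacy)
    cov = [[sigma2, rho * sigma * sd_eff],
           [rho * sigma * sd_eff, sd_eff ** 2]]
    draws = rng.multivariate_normal([mu, mean_eff], cov, size=n)

    # Step 2: transform the cost component to its skewed lognormal analogue
    cost = np.exp(draws[:, 0])
    eff = draws[:, 1]
    return cost, eff

# One simulated data set with the sample sizes and moments from the text
rng = np.random.default_rng(2005)
cost_s, qaly_s = simulate_group(35, 4855, 312, 27.5, 6.5, 0.55, rng)
cost_c, qaly_c = simulate_group(32, 1970, 290, 21.9, 7.2, 0.39, rng)
```

With moment matching alone the simulated skewness is determined by the coefficient of variation, so it will not in general reproduce the skewness values reported for the original cost data.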

In the course of the simulation, the deviation of both the Fieller and the Bootstrap upper 95% confidence bound from the simulated target ICER was computed; quartiles of the empirical distribution of these 5,000 deviations were derived as indicators of estimation validity. In summary, the Fieller estimate of the upper confidence bound showed a median deviation of 6.1% (interquartile range 4.5 – 7.2%) from the simulated target ICER; the Bootstrap estimate imitated the target ICER’s distribution more closely and resulted in a median deviation of 1.8% (1.0 – 3.1%).
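The summary of the deviations can be illustrated with a short Python sketch. The text does not define the deviation measure exactly; the sketch assumes a signed relative deviation in percent of the target ICER, and the function name `deviation_quartiles` is hypothetical.

```python
import numpy as np

def deviation_quartiles(upper_bounds, target_icer):
    """Quartiles of the relative deviation (in %) of upper bounds from a target.

    Hypothetical helper; signed relative deviation is an assumption here.
    """
    upper_bounds = np.asarray(upper_bounds, dtype=float)
    deviation_pct = 100.0 * (upper_bounds - target_icer) / target_icer
    q1, median, q3 = np.percentile(deviation_pct, [25, 50, 75])
    return q1, median, q3
```

Applied to the vector of upper bounds from all simulation replications, the function returns the lower quartile, the median and the upper quartile, which is the form in which the deviations are reported above.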


Discussion

Whereas the ICER-based approach to cost effectiveness evaluation provides instructive information, its statistical treatment imposes severe model assumptions: both the Fieller and the Bootstrap approach to interval estimation result in notably biased confidence intervals. One can only hypothesize why the above simulation setting suggested a rather encouraging behaviour of the Bootstrap estimate: the latter is based on multivariate replication and therefore makes it possible to introduce the multivariate correlation structure between the cost and efficacy data into the interval estimation. Nevertheless, it must be remembered that the Bootstrap point estimate might also be biased in the real patient data setting: the rather small underlying sample sizes (35 versus 32 patients) do not necessarily allow for multivariate Bootstrap approximation [3]. On the other hand, the Fieller estimate turned out to be even more biased in the simulation study, since its dependence on asymptotic normality is even more crucial than for Bootstrap estimation.

In summary, both Bootstrap and Fieller estimates must be applied to ICER interval estimation with caution. In larger sample size settings the Bootstrap approach will be less biased due to its less restrictive distribution assumptions and its ability to imitate the multivariate dependence structure of the underlying bivariate data. However, the need for alternative robust approaches to interval estimation in incremental cost effectiveness evaluation [1] is obvious.


References

[1] Heitjan DF. Fieller's method and net health benefits. Health Economics 2000; 9: 327-35
[2] Wakker P, Klaassen MP. Confidence intervals for cost-effectiveness ratios. Health Economics 1995; 4: 373-81
[3] Bickel PJ, Freedman DA. Some asymptotic theory for the Bootstrap. Annals of Statistics 1981; 9: 1196-1217
[4] Said S. Vergleich der inkrementellen Kosteneffektivität der offenen und der geschlossenen Versorgung von Collumfrakturen aus Perspektive der Leistungserstatter. Dissertation zur Erlangung des Grades "Dr. med.", Fachbereich Medizin der Universität Mainz; 2004
[5] Seither C. Vorschläge zur gesundheitsökonomischen Evaluation zahnärztlicher Präventionsprogramme im Kindesalter. Dissertation zur Erlangung des Grades "Dr. med. dent.", Fachbereich Medizin der Universität Mainz; 2004