gms | German Medical Science

16. Deutscher Kongress für Versorgungsforschung

Deutsches Netzwerk Versorgungsforschung e. V.

4-6 October 2017, Berlin

Innovative statistical procedures applied on mental healthcare data enable to save time and money

Meeting Abstract

  • Klemens Weigl - Philipps-Universität Marburg, Marburg, Germany
  • Johannes von Wilucki - Philipps-Universität Marburg, Marburg, Germany
  • Karl H. Beine - St. Marien-Hospital Hamm, Hamm, Germany
  • Michaela Assheuer - Private Universität Witten/Herdecke gGmbH, Witten, Germany
  • Max Geraedts - Philipps-Universität Marburg, Marburg, Germany

16. Deutscher Kongress für Versorgungsforschung (DKVF). Berlin, 04.-06.10.2017. Düsseldorf: German Medical Science GMS Publishing House; 2017. DocV198

doi: 10.3205/17dkvf043, urn:nbn:de:0183-17dkvf0436

Published: 26 September 2017

© 2017 Weigl et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 License.



Background: In numerous studies, patient satisfaction with inpatient psychiatric care is considered an important quality indicator of mental health services. However, besides financial and organizational constraints, classical single-stage studies of patient satisfaction share a methodological and statistical drawback: they cannot identify promising or harmful treatment effects early.

Research Objective: Using a patient satisfaction survey at two psychiatric hospitals in Germany as an example, we apply an interim analysis to the observed data (e.g. after Stage 1 in a two-stage design) to determine whether to stop the survey before the scheduled end of the study. Additionally, we examine whether the two psychiatric hospitals differ in patient satisfaction.

Method: From the wide field of interim analyses, we apply a sophisticated adaptive group sequential design, which requires more careful planning than a classical single-stage design.

Study Design: The project is designed as a prospective cohort study. Adaptive group sequential methods require that all necessary details be specified a priori. We therefore choose the approach by Wang and Tsiatis [1] with K = 2 stages and power parameter Δ = .25, and we select the inverse normal combination test by Lehmacher and Wassmer [2] to introduce adaptivity. This yields the Stage 1 early stopping (efficacy) boundary α1 = .00768 and the Stage 2 rejection boundary c = .0208, both based on an overall α = .025 with one-sided testing (numerically equivalent to two-sided testing at α = .05). Additionally, we set the early stopping boundary α0 = 1, i.e. there is no early stopping for futility. Finally, we compute the value of the conditional error function at the Stage 1 boundary, A(α1) = 0.32, and set it as the lower limit of conditional power for increasing the sample size.
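
The reported boundaries can be checked with a short calculation. The following is a minimal sketch, not the authors' R code. It assumes the inverse normal combination test uses equal weights for the two stages (plausible because both stages have the same planned size) and that α0 = 1 means the study always continues when Stage 1 is not significant. The function names are hypothetical. It computes the conditional error function A(p1) and numerically integrates the overall one-sided type I error.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

ALPHA_1 = 0.00768  # Stage 1 efficacy boundary (p-value scale), from the abstract
C_FINAL = 0.0208   # Stage 2 rejection boundary (p-value scale), from the abstract


def conditional_error(p1, c=C_FINAL):
    """A(p1): probability under H0 of rejecting at Stage 2, given Stage 1 p-value p1.

    Assumes the equally weighted inverse normal combination (z1 + z2) / sqrt(2).
    """
    z1 = STD.inv_cdf(1.0 - p1)
    z_final = STD.inv_cdf(1.0 - c)
    return 1.0 - STD.cdf(sqrt(2.0) * z_final - z1)


def overall_type_one_error(alpha_1=ALPHA_1, c=C_FINAL, steps=4000):
    """alpha_1 plus P(continue and reject at Stage 2) under H0 (midpoint rule)."""
    u1 = STD.inv_cdf(1.0 - alpha_1)   # Stage 1 boundary on the z scale
    z_final = STD.inv_cdf(1.0 - c)    # Stage 2 boundary on the z scale
    lower = -8.0                      # effectively minus infinity
    width = (u1 - lower) / steps
    total = 0.0
    for i in range(steps):
        z1 = lower + (i + 0.5) * width
        total += STD.pdf(z1) * (1.0 - STD.cdf(sqrt(2.0) * z_final - z1)) * width
    return alpha_1 + total


a_alpha1 = conditional_error(ALPHA_1)
alpha_total = overall_type_one_error()
print(f"A(alpha_1) = {a_alpha1:.3f}")
print(f"overall one-sided alpha = {alpha_total:.4f}")
```

Under these assumptions, A(α1) rounds to the reported 0.32, and the overall one-sided error stays close to the nominal .025.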

Data Collection: The data are sampled at two different psychiatric hospitals. For the application of group sequential methods, a priori sample size estimation for a two-group comparison with α = .05 (two-sided), β = .1 (for 90% power), and a medium effect size of δ = .5 yields an exact total sample size of N = 170.06. This has to be multiplied by the sample size inflation factor IF = 1.034 of the chosen Wang and Tsiatis design, which gives N(IF) = 175.85 and is rounded up to the next integer, N(IF;r.) = 176 ('r.' denotes rounded). This yields nA1 = nB1 = 44 for Stage 1 and nA2 = nB2 = 44 for Stage 2 (A and B denote the two psychiatric hospitals).
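
The sample size arithmetic can be retraced as follows. This is a sketch under stated assumptions: the medium effect size is taken as Cohen's d = 0.5, and the fixed-design sample size is approximated with the normal distribution. That approximation is slightly smaller than the t-based N = 170.06 reported in the abstract, which is therefore used as given for the inflation step. All variable names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

ALPHA_TWO_SIDED = 0.05
POWER = 0.90
EFFECT_SIZE = 0.5          # assumption: Cohen's d for a "medium" effect
N_FIXED_REPORTED = 170.06  # total N of the fixed design, as reported
INFLATION_FACTOR = 1.034   # inflation factor of the chosen design, as reported

# Normal approximation for a two-sample comparison (per group, then total).
z_alpha = STD.inv_cdf(1.0 - ALPHA_TWO_SIDED / 2.0)
z_beta = STD.inv_cdf(POWER)
n_per_group_normal = 2.0 * (z_alpha + z_beta) ** 2 / EFFECT_SIZE ** 2
n_total_normal = 2.0 * n_per_group_normal

# Inflate the reported fixed-design N and round up to the next integer.
n_inflated = N_FIXED_REPORTED * INFLATION_FACTOR
n_total = ceil(n_inflated)

# Two hospitals and two equally sized stages give four equal cells.
n_per_hospital_per_stage = n_total // 4

print(f"normal approximation, total N: {n_total_normal:.2f}")
print(f"inflated N: {n_inflated:.2f}, rounded up: {n_total}")
print(f"patients per hospital and stage: {n_per_hospital_per_stage}")
```

The normal approximation lands about two patients below the reported exact value, which is the expected size of the t-distribution correction at this sample size.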

Data Analysis: The statistical analysis was performed with the statistical software R and IBM® SPSS® Statistics, Version 23 (SPSS only for the t-test).

Results: The statistical prerequisites for parametric analyses (on the aggregated mean score of the questionnaire), such as variance homogeneity and normality, were fulfilled. After data collection of nA1 = nB1 = 44 for Stage 1, the independent two-sample t-test showed no effect large enough for early stopping for efficacy (t(86) = .81; one-sided p = .21). The conditional power actually achieved, based on the current trend, was 10 per cent (cp = 0.1). This is below the a priori critical boundary given by the conditional error function, A(α1), which indicates that the sample size should not be increased. In fact, achieving a significant result after Stage 2 would require an unrealistically large adjusted Stage 2 sample size of nA2 = nB2 = 578 for 80 per cent power and nA2 = nB2 = 765 for 90 per cent power.
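
The conditional power calculation can be approximated in a few lines. This is a hedged sketch rather than the authors' analysis: it assumes the equally weighted inverse normal combination test, treats the Stage 2 test statistic as normally distributed, and estimates the effect under the "current trend" as d = t · sqrt(2 / n1). The function names are hypothetical. Because of these approximations and the rounding of the reported t and p values, the required sample sizes come out close to, but not exactly at, the reported 578 and 765.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

C_FINAL = 0.0208  # Stage 2 rejection boundary (p-value scale), from the abstract
P1 = 0.21         # one-sided Stage 1 p-value, from the abstract
T1 = 0.81         # Stage 1 t statistic, from the abstract
N1 = 44           # patients per hospital in Stage 1
N2_PLANNED = 44   # patients per hospital planned for Stage 2

# z value that Stage 2 alone must reach for the combination test to reject.
z1 = STD.inv_cdf(1.0 - P1)
z2_needed = sqrt(2.0) * STD.inv_cdf(1.0 - C_FINAL) - z1

# Assumption: standardized effect implied by the Stage 1 trend.
d_hat = T1 * sqrt(2.0 / N1)


def conditional_power(n2, effect=d_hat):
    """Probability of rejecting after Stage 2 with n2 patients per hospital."""
    return 1.0 - STD.cdf(z2_needed - effect * sqrt(n2 / 2.0))


def stage2_n_per_group(target_power, effect=d_hat):
    """Smallest n2 per hospital reaching the target conditional power."""
    z_power = STD.inv_cdf(target_power)
    return ceil(2.0 * ((z2_needed + z_power) / effect) ** 2)


cp = conditional_power(N2_PLANNED)
n80 = stage2_n_per_group(0.80)
n90 = stage2_n_per_group(0.90)
print(f"conditional power with planned Stage 2: {cp:.2f}")
print(f"n2 per hospital for 80% / 90% power: {n80} / {n90}")
```

With the planned 44 patients per hospital in Stage 2, the conditional power comes out at about 10 per cent, in line with the reported cp = 0.1.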

Discussion: Given these results after Stage 1 and the very low conditional power achieved, we decided not to increase the sample size. Moreover, based on the very clear trend and overwhelming evidence of no difference in patient satisfaction between the two psychiatric hospitals, we performed stochastic curtailment and decided not to investigate this endpoint further after Stage 1. The achieved conditional power falls far short of the 90 per cent power intended a priori for the sample size estimation in the planning phase. Obtaining a significant result after Stage 2 is therefore highly unlikely.

Practical Implications: The greatest practical impact of our approach is that the 88 patients planned a priori for Stage 2 did not need to be surveyed. In summary, because such designs save, on average over many studies, precious resources such as time and money while raising the ethical standard, we conclude that sequential monitoring using adaptive group sequential designs greatly improves mental health services research.


[1] Wang SK, Tsiatis AA. Approximately optimal one-parameter boundaries for group sequential trials. Biometrics. 1987;43:193-199.
[2] Lehmacher W, Wassmer G. Adaptive sample size calculations in group sequential trials. Biometrics. 1999;55:1286-1290.