
65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

Power Calculations for Replication Studies with an Interim Analysis

Meeting Abstract


  • Charlotte Micheloud - University of Zurich, Zurich, Switzerland
  • Leonhard Held - University of Zurich, Zurich, Switzerland

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 60

doi: 10.3205/20gmds142, urn:nbn:de:0183-20gmds1421

Published: February 26, 2021

© 2021 Micheloud et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information see http://creativecommons.org/licenses/by/4.0/.



Text

As a consequence of the so-called ‘replication crisis’ [1], an increasing number of replication studies have been conducted to determine the reliability of original findings. Ideally, the procedures of the replication study should match those of the original study as closely as possible. However, selecting the same sample size in the replication as in the original study can lead to a severely underpowered design, and true effects may not be detected [2]. Larger sample sizes are therefore usually required in replication studies, and most replication projects determine them with conditional power calculations based on the effect estimate from the original study. This approach is not well suited, as it does not take the uncertainty of the original result into account.
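
For intuition, a minimal numerical sketch of such a conditional power calculation under a normal approximation (our illustration, not taken from the abstract; the one-sided significance level of 0.025, the relative sample size c = n_replication / n_original, and the example p-value of 0.03 are assumptions):

    # Sketch of conditional power for a replication study, treating the original
    # effect estimate as the true effect (normal approximation; illustrative values).
    from scipy.stats import norm

    def conditional_power(zo, c, alpha=0.025):
        """Power of the replication if the true effect equals the original estimate."""
        return norm.cdf(abs(zo) * c**0.5 - norm.ppf(1 - alpha))

    def relative_sample_size(zo, power=0.8, alpha=0.025):
        """Relative sample size c = n_r / n_o needed to reach the target conditional power."""
        return ((norm.ppf(1 - alpha) + norm.ppf(power)) / abs(zo)) ** 2

    zo = norm.ppf(1 - 0.03 / 2)                           # original two-sided p-value of 0.03
    print(round(conditional_power(zo, c=1.0), 2))         # ~0.58: same sample size is underpowered
    print(round(relative_sample_size(zo, power=0.8), 2))  # ~1.67: about 1.7 x n_o for 80% power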

In clinical trials, Bayesian methods are often used to incorporate prior information into power calculations. We propose to adapt this methodology to the replication setting. A sensible prior in this context is a normal prior centered around the original effect estimate with variance inversely proportional to the original sample size [3]. The resulting power is called ‘predictive’ [4] and has so far not been used to design replication studies. Moreover, as many replication projects are being undertaken, optimal allocation of resources is particularly important. We describe how the methodology used in sequential clinical trials and drug development [5], [6] can be tailored to replication studies. The predictive interim power, i.e. the predictive power of the replication study conditional on the data already collected, is shown to be useful for deciding whether to stop an experiment at interim. In addition to an informative prior, a flat prior can also be used in this setting.
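
As a sketch of the predictive approach (again our illustration, not the authors' code): averaging the conditional power of the previous sketch over a normal prior centered at the original estimate, with variance equal to its squared standard error, gives a closed form; the symbols, significance level, and example values are the same assumptions as above.

    # Sketch of predictive power: conditional power averaged over a normal prior
    # centred at the original estimate with variance equal to its squared standard
    # error, giving PP(c) = Phi((sqrt(c)*zo - z_{1-alpha}) / sqrt(1 + c)).
    from scipy.stats import norm

    def predictive_power(zo, c, alpha=0.025):
        """Predictive power of the replication with relative sample size c = n_r / n_o."""
        return norm.cdf((abs(zo) * c**0.5 - norm.ppf(1 - alpha)) / (1 + c) ** 0.5)

    zo = norm.ppf(1 - 0.03 / 2)        # original two-sided p-value of 0.03 (illustrative)
    for c in (0.5, 1.0, 2.0, 5.0):
        print(c, round(predictive_power(zo, c), 3))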

At the start of the replication study, as well as at interim, predictive power turns out to be a different concept from conditional power. It generally leads to smaller values than conditional power and does not always increase with the replication sample size. Remarkably, for large sample sizes the predictive power cannot exceed one minus the one-sided interim p-value. Adding more subjects to the replication study can even decrease the predictive interim power if the p-value at interim is only 'suggestive', i.e. slightly below the significance level.
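
A quick numerical check of the ceiling property, again under the normal approximation and illustrative values of the sketches above (not the authors' analysis): as the relative sample size c grows, predictive power levels off at Phi(zo), i.e. one minus the one-sided p-value of the data behind the prior, while conditional power tends to one.

    # Illustration: predictive power is capped at 1 minus the one-sided p-value,
    # whereas conditional power approaches 1 as the replication sample size grows.
    from scipy.stats import norm

    zo = norm.ppf(1 - 0.03 / 2)   # z-value of a two-sided p-value of 0.03 (illustrative)
    za = norm.ppf(1 - 0.025)      # one-sided significance level of 0.025 (assumption)

    for c in (1, 5, 20, 100, 1000):
        cond = norm.cdf(zo * c**0.5 - za)
        pred = norm.cdf((zo * c**0.5 - za) / (1 + c) ** 0.5)
        print(f"c={c:>4}: conditional={cond:.3f}  predictive={pred:.3f}")

    print("ceiling 1 - one-sided p =", round(norm.cdf(zo), 3))   # ~0.985 here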

We illustrate these properties using data from a large-scale project on the replicability of social science experiments [7]. This project reports replication results for 21 studies and was conducted in a sequential manner.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1.
Begley CG, Ioannidis JP. Reproducibility in science. Circulation Research. 2015;116:116–126.
2.
Goodman SN. A comment on replication, p-values and evidence. Statistics in Medicine. 1992;11:875–879.
3.
Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. John Wiley & Sons; 2004.
4.
Spiegelhalter DJ, Freedman LS. A predictive approach to selecting the size of a clinical trial, based on subjective clinical opinion. Statistics in Medicine. 1986;5:1–13.
5.
Dallow N, Fina P. The perils with the misuse of predictive power. Pharmaceutical Statistics. 2011;10:311–317.
6.
Rufibach K, Burger HU, Abt M. Bayesian predictive power: choice of prior and some recommendations for its use as probability of success in drug development. Pharmaceutical Statistics. 2016;15:438–446.
7.
Camerer CF, Dreber A, Holzmeister F, Ho TH, Huber J, Johannesson M, Kirchler M, Nave G, Nosek BA, Pfeiffer T, Altmejd A, Buttrick N, Chan T, Chen Y, Forsell E, Gampa A, Heikensten E, Hummer L, Imai T, Isaksson S, Manfredi D, Rose J, Wagenmakers EJ, Wu H. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour. 2018;2:637–644.