gms | German Medical Science

65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

Large-scale empirical assessment of replicability in healthcare meta-analyses

Meeting Abstract

  • Orestis Panagiotou - Brown University, Providence, United States
  • Kirsten Voorhies - Brown University, Providence, United States
  • Iman Jaljuli - Tel-Aviv University, Tel-Aviv, Israel
  • Christopher Schmid - Brown University, Providence, United States
  • Ruth Heller - Tel-Aviv University, Tel-Aviv, Israel

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 417

doi: 10.3205/20gmds045, urn:nbn:de:0183-20gmds0454

Published: February 26, 2021

© 2021 Panagiotou et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.



Background: Replicability of treatment effects protects patients, clinicians, and policy makers from claiming conclusive evidence based solely on the results of a single study, which may be a false positive due to chance or bias. Previous efforts to quantify the replicability of clinical trials have used qualitative, inconsistent, and abstract definitions without measuring a well-defined property of the evidence base. Recent methods provide robust inferences on replicability by framing it as a statistical hypothesis testing problem.

Methods: We performed a replicability analysis of meta-analyses in the Cochrane Database of Systematic Reviews. We included all meta-analyses of binary outcomes with n>4 studies. We applied the partial conjunction hypothesis test to quantify the evidence for replicability. The method establishes that the treatment effect is replicated in at least u out of n studies by testing the u/n-replicability null hypothesis, i.e., that at least n-(u-1) of the component null hypotheses in a meta-analysis simultaneously hold true. It yields a summary measure, the r-value, which is the p-value of this replicability null hypothesis. Replicability is established if the r-value is less than the type I error α=0.05. Using the same meta-analytical methods as the Cochrane reviews, we computed the r-value for u=2 and u=3 to determine whether the treatment effect is replicated in at least 2 and at least 3 studies, respectively. For each meta-analysis, we also computed u-max, the maximum u for which the u/n-replicability null hypothesis is rejected; u-max is a 1-α lower confidence bound on the number of studies with an effect in the same direction.
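
The partial conjunction r-value can be computed in several ways; one standard construction (not necessarily the one used in this analysis, which relied on the Cochrane meta-analytic models) applies Fisher's combination test to the n-u+1 largest study p-values. The sketch below, in Python with SciPy, illustrates that construction and the resulting u-max; it assumes a vector of study-level p-values, ignores effect direction, and its function names are illustrative:

```python
import math
from scipy import stats

def partial_conjunction_rvalue(pvalues, u):
    """r-value (p-value) for the u/n-replicability null hypothesis:
    Fisher's combination of the n-u+1 largest study p-values."""
    n = len(pvalues)
    if not 1 <= u <= n:
        raise ValueError("u must be between 1 and n")
    largest = sorted(pvalues)[u - 1:]               # the n-u+1 largest p-values
    fisher_stat = -2.0 * sum(math.log(p) for p in largest)
    return stats.chi2.sf(fisher_stat, df=2 * len(largest))

def u_max(pvalues, alpha=0.05):
    """Largest u whose u/n-replicability null is rejected at level alpha
    (0 if even u=1 is not rejected)."""
    best = 0
    for u in range(1, len(pvalues) + 1):
        if partial_conjunction_rvalue(pvalues, u) <= alpha:
            best = u
    return best
```

For example, a meta-analysis in which a single very small p-value drives significance (one study at 1e-6, the rest near 0.5-0.8) rejects the 1/n null but not the 2/n null, so u-max is 1: the apparent effect rests on one study.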

Results: A total of 23,561 meta-analyses with 258,948 individual trials were eligible. The median number of studies per meta-analysis was 8 (interquartile range, IQR, 6-12) and the median sample size was 2,984 (IQR, 1,231-7,722). Replicability for u=2 was not met (r>0.05) in 15,482 (66%) meta-analyses and for u=3 in 17,738 (75%) meta-analyses. There were 9,863 statistically significant meta-analyses. Among those, replicability for u=2 was not met in 2,970 (30%), with 1 study driving the meta-analysis significance; for u=3, replicability was not met in 4,493 (46%), with 2 studies driving the significance. The median u-max was 3 (IQR, 1-5) and the median ratio of u-max to the total number of studies was 33% (IQR, 14%-60%). In total, 5,078 (22%) meta-analyses had evidence of small study effects, and the treatment effect was replicated in at least two studies in 2,684 (53%) of those meta-analyses. Among statistically significant meta-analyses whose treatment effect was replicated in at least two studies (n=6,893), the treatment effect in the replicated studies differed from the overall meta-analysis effect by more than 10% in 3,518 (51%) meta-analyses; differences in treatment effects between the replicated studies and the overall meta-analysis were significant (no overlap in confidence intervals) in 34 cases. Results were similar when using α=0.005 and α=0.001.

Conclusion: Treatment effects are replicated in at least 2 trials in up to 70% of statistically significant healthcare meta-analyses. The differences between replicated effects and the overall meta-analysis effects are small. For many meta-analyses, statistical significance is sensitive to a small number of studies relative to the number of synthesized studies.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.