Article
Large-scale empirical assessment of replicability in healthcare meta-analyses
Published: February 26, 2021
Text
Background: Replicability of treatment effects protects patients, clinicians, and policy makers from claiming conclusive evidence based solely on the results of a single study, which may be a false positive due to chance or bias. Previous efforts to quantify the replicability of clinical trials have used qualitative, inconsistent, and abstract definitions without measuring a well-defined property of the evidence base. Recent methods provide robust inferences on replicability by framing it as a statistical hypothesis testing problem.
Methods: We performed a replicability analysis of meta-analyses in the Cochrane Database of Systematic Reviews. We included all meta-analyses of binary outcomes with n>4 studies. We applied the partial conjunction hypothesis test to quantify the evidence for replicability. The method establishes that the treatment effect is replicated in at least u out of n studies by testing the u/n-replicability null hypothesis, i.e. that at least n-(u-1) of the component null hypotheses in a meta-analysis simultaneously hold true. It calculates a summary measure, the r-value, which is the p-value for this replicability null hypothesis. Replicability is established if the r-value is less than the type I error rate α=0.05. Using the same meta-analytical methods as the Cochrane reviews, we computed the r-value for u=2 and u=3 to determine whether the treatment effect is replicated in at least 2 and in at least 3 studies, respectively. For each meta-analysis, we also computed u-max, i.e. the maximum u for which the u/n-replicability null hypothesis is rejected; u-max is the 1-α lower confidence bound on the number of studies with an effect in the same direction.
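The logic of the u/n-replicability test and of u-max can be illustrated with a minimal sketch. It is not the procedure used in this analysis, which derives the r-value with the same meta-analytical methods as the Cochrane reviews. As a simplifying assumption, the sketch combines one-sided per-study p-values with a Bonferroni-style partial conjunction rule, (n-u+1) times the u-th smallest p-value, made monotone in u. The function names and example p-values are hypothetical.

```python
def partial_conjunction_r_value(p_values, u):
    """Bonferroni-style r-value for the u/n-replicability null hypothesis.

    Assumption: p_values are one-sided per-study p-values, all testing the
    same effect direction. This is an illustrative stand-in, not the
    meta-analysis-based r-value described in the abstract.
    """
    n = len(p_values)
    if not 1 <= u <= n:
        raise ValueError("u must satisfy 1 <= u <= n")
    ordered = sorted(p_values)
    r_value = 0.0
    # For each k <= u, (n - k + 1) * p_(k) is the Bonferroni-style p-value
    # for the k/n null; the running maximum keeps r-values monotone in u.
    for k in range(1, u + 1):
        candidate = min(1.0, (n - k + 1) * ordered[k - 1])
        r_value = max(r_value, candidate)
    return r_value


def u_max(p_values, alpha=0.05):
    """Largest u whose u/n-replicability null is rejected (0 if none)."""
    best = 0
    for u in range(1, len(p_values) + 1):
        if partial_conjunction_r_value(p_values, u) < alpha:
            best = u
        else:
            break  # r-values are monotone in u, so later u cannot reject
    return best


if __name__ == "__main__":
    # Hypothetical one-sided p-values for a meta-analysis of n = 5 studies.
    study_p_values = [0.001, 0.004, 0.03, 0.2, 0.6]
    for u in (2, 3):
        print(u, partial_conjunction_r_value(study_p_values, u))
    print("u-max:", u_max(study_p_values))
```

With these hypothetical inputs, the r-value for u=2 is about 0.016 and for u=3 about 0.09. At α=0.05, replicability is therefore established in at least 2 studies but not in 3, and u-max is 2.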
Results: A total of 23,561 meta-analyses with 258,948 individual trials were eligible. The median number of studies per meta-analysis was 8 (interquartile range, IQR, 6-12) and the median sample size was 2,984 (IQR, 1,231-7,722). Replicability for u=2 was not met (r>0.05) in 15,482 (66%) meta-analyses and for u=3 in 17,738 (75%) meta-analyses. There were 9,863 statistically significant meta-analyses. Among those, replicability for u=2 was not met in 2,970 (30%), with 1 study driving the significance of the meta-analysis; for u=3, replicability was not met in 4,493 (46%), with 2 studies driving the significance. The median u-max was 3 (IQR, 1-5) and the median ratio of u-max to the total number of studies was 33% (IQR, 14%-60%). In total, 5,078 (22%) meta-analyses had evidence of small-study effects, and the treatment effect was replicated in at least two studies in 2,684 (53%) of those meta-analyses. Among statistically significant meta-analyses whose treatment effect was replicated in at least two studies (n=6,893), the difference in treatment effect between the replicated studies and the overall meta-analysis was greater than 10% for 3,518 (51%) meta-analyses; differences in treatment effects between the replicated studies and the overall meta-analysis were significant (no overlap in confidence intervals) in 34 cases. Results were similar when using α=0.005 and α=0.001.
Conclusion: Treatment effects are replicated in at least 2 trials in up to 70% of statistically significant healthcare meta-analyses. The differences between replicated effects and the overall meta-analysis effects are small. For many meta-analyses, statistical significance is sensitive to a small number of studies relative to the number of synthesized studies.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.