Analyzing propensity score methods on randomized controlled clinical trials
Published: 26 February 2021
Background: Propensity score methods are widely used to analyze treatment effects in non-randomized controlled trials and observational studies. They account only for observed confounders, and they do so at the analysis stage. Randomized trials, in contrast, account for both observed and unobserved confounders by design: the experimental and control groups are expected to be drawn from the same population, with confounders equally distributed between treatment arms. Small single-arm studies are now quite common, especially in early drug development in oncology. Propensity score methods offer an opportunity to use health care data as external controls. Whereas a strength of this approach is the reduced sample size, we investigate its potential limitations.
Objectives: Our research question is: How do propensity score methods perform in an ideal setting, i.e. under randomized treatment allocation, with real data and different sample sizes?
Methods: To investigate propensity score methods under randomized treatment allocation, we use two randomized controlled clinical trials with a time-to-event endpoint. The treatment effect to be reproduced is measured by a hazard ratio. The sample sizes were planned to reach 90% power at a one-sided significance level of 2.5%.
Using a pool of available baseline covariates, we select the best model for the time-to-event endpoint by the Akaike information criterion (AIC). The covariates that form this best model are then included in a logistic regression model to estimate the propensity scores.
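The propensity score step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy data, the function names `fit_logistic` and `propensity_scores`, and the choice of plain gradient ascent with a fixed step size are all assumptions made for the example, and the covariates are assumed to be standardized.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, t, lr=1.0, n_iter=500):
    """Fit a logistic regression of treatment t (0/1) on covariates X
    by batch gradient ascent on the mean log-likelihood.
    Returns [intercept, b1, ..., bp]. Assumes standardized covariates."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, ti in zip(X, t):
            xi = [1.0] + list(row)
            resid = ti - sigmoid(sum(b * x for b, x in zip(beta, xi)))
            for j in range(p + 1):
                grad[j] += resid * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    # Estimated probability of receiving the experimental treatment.
    return [
        sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
        for row in X
    ]


# Hypothetical toy data: randomized allocation, two standardized covariates.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
t = [rng.randint(0, 1) for _ in range(200)]

beta = fit_logistic(X, t)
ps = propensity_scores(X, beta)
```

Because allocation in the toy data does not depend on the covariates, the fitted scores only reflect chance imbalance. In practice an established statistics package would normally be used for this step in place of a hand-written optimizer.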
The following propensity score methods are evaluated: matching, weighting (inverse probability of treatment weighting) and stratification. Hazard ratios from the propensity score analyses are then compared with those obtained from standard Cox modelling.
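Given estimated propensity scores, the three approaches can be sketched roughly as below. The abstract does not specify the variants used, so the following are assumptions for illustration only: weights targeting the average treatment effect, equal-sized strata by propensity score rank, and greedy 1:1 nearest-neighbour matching without replacement within a caliper. All function names and the example values are hypothetical.

```python
def iptw_weights(ps, t):
    # Assumed ATE weights: 1/e(x) for treated, 1/(1 - e(x)) for controls.
    return [1.0 / p if ti == 1 else 1.0 / (1.0 - p) for p, ti in zip(ps, t)]


def quantile_strata(ps, n_strata=5):
    # Assign subjects to n_strata equal-sized strata by propensity score rank.
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    strata = [0] * len(ps)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(ps)
    return strata


def greedy_match(ps, t, caliper=0.1):
    # Greedy 1:1 nearest-neighbour matching without replacement.
    # A treated subject stays unmatched if no control lies within the caliper.
    treated = [i for i, ti in enumerate(t) if ti == 1]
    controls = [i for i, ti in enumerate(t) if ti == 0]
    pairs = []
    for i in treated:
        if not controls:
            break
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        if abs(ps[j] - ps[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs


# Hypothetical propensity scores and treatment indicators.
example_ps = [0.30, 0.35, 0.50, 0.52, 0.70, 0.90]
example_t = [0, 1, 1, 0, 0, 1]

weights = iptw_weights(example_ps, example_t)
strata = quantile_strata(example_ps, n_strata=3)
pairs = greedy_match(example_ps, example_t, caliper=0.1)
```

In this toy example the treated subject with score 0.90 has no control within the caliper and remains unmatched, which shows on a small scale how poorly overlapping score distributions reduce the usable sample. The weights, strata or matched pairs would then enter a weighted, stratified or matched Cox model to obtain the hazard ratio; that step is not shown here.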
To examine the quality of matching, we use standardized mean differences as well as histograms of the propensity score distributions before and after matching.
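For a continuous covariate, the standardized mean difference is commonly computed as the difference in group means divided by the pooled standard deviation. The sketch below assumes this common definition; the abstract does not state which formula or threshold was applied, and the example values are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(x_treated, x_control):
    # Difference in means divided by the pooled standard deviation,
    # where the pooled variance is the average of the two sample variances.
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2.0)
    return (mean(x_treated) - mean(x_control)) / pooled_sd


# Hypothetical ages in the treated and control groups.
age_treated = [60.0, 62.0, 64.0]
age_control = [58.0, 60.0, 62.0]

smd_age = standardized_mean_difference(age_treated, age_control)
```

Computing this quantity for every covariate before and after matching shows whether matching has reduced the imbalance. An absolute value below 0.1 is often used as a rule of thumb for adequate balance.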
To investigate the applicability to studies with small sample sizes, we also apply all methods using only 20% of the experimental arm subjects from the randomized trial. This leads to 5 random subsets, each of which is matched with the complete control arm from the randomized trial as external control.
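One way to form such subsets is to shuffle the experimental arm once and partition it into five disjoint parts. That the subsets are disjoint is an assumption here, as are the function name, the seed and the subject identifiers.

```python
import random


def split_into_subsets(subject_ids, n_subsets=5, seed=0):
    # Shuffle once, then deal the subjects into n_subsets disjoint groups,
    # so that each group holds about 1/n_subsets of the experimental arm.
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    return [ids[k::n_subsets] for k in range(n_subsets)]


# Hypothetical experimental arm of 100 subjects, identified by 0..99.
subsets = split_into_subsets(range(100), n_subsets=5, seed=0)
```

Each subset would then be analyzed separately against the full control arm, giving five hazard ratio estimates whose spread indicates the variability at the reduced sample size.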
Results: Under randomized treatment allocation, the results of the propensity score methods were comparable to standard regression results in only one of the two clinical trial examples. There were no substantial differences between the investigated propensity score methods. Propensity score methods also failed when the propensity score distributions of the treated and control groups differed strongly. In addition, we observed high variability in the hazard ratio estimators under small sample sizes.
Conclusion: If external controls and propensity score methods are considered for early clinical development, careful evaluation in the planning phase is essential (sample size, inclusion and exclusion criteria, method evaluation ...). Furthermore, propensity score methods should be handled with caution in early phase trials.
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.