gms | German Medical Science

65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

Analyzing propensity score methods on randomized controlled clinical trials

Meeting Abstract

  • Daliah Dieckmann - Merck KGaA, Darmstadt, Germany; Hochschule Darmstadt, Darmstadt, Germany
  • Heiko Götte - Merck KGaA, Darmstadt, Germany
  • Antje Jahn - Hochschule Darmstadt, Darmstadt, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 78

doi: 10.3205/20gmds274, urn:nbn:de:0183-20gmds2745

Published: February 26, 2021

© 2021 Dieckmann et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.



Background: Propensity score methods are widely used to analyze treatment effects in non-randomized controlled trials and observational studies. They account only for observable confounders, and they do so through the analysis. Randomized trials, in contrast, account for observable and unobservable confounders through the design: the experimental and control groups are expected to be drawn from the same population, so confounders are expected to be equally distributed between treatment arms. Small single-arm studies are now quite common, especially in early drug development in oncology. Propensity score methods offer an opportunity to use health care data as external controls. A strength of this approach is the reduced sample size; here we investigate its potential limitations.

Objectives: Our research question is: How do propensity score methods perform in an ideal setting, i.e. under randomized treatment allocation, with real data and different sample sizes?

Methods: To investigate propensity score methods under randomized treatment allocation, we use two randomized controlled clinical trials with a time-to-event endpoint. The treatment effect to be reproduced is measured by a hazard ratio. The sample sizes were planned to reach 90% power at a one-sided significance level of 2.5%.
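
To give a feeling for what 90% power at a one-sided 2.5% level implies for a time-to-event endpoint, the following sketch uses Schoenfeld's approximation for the required number of events under 1:1 allocation. The abstract does not state how the two trials were actually planned; the formula choice, the function name `required_events` and the target hazard ratio of 0.7 are assumptions made purely for illustration.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha_one_sided=0.025, power=0.90):
    """Approximate number of events for a log-rank test (Schoenfeld, 1:1 allocation).

    Assumption: this is an illustrative planning formula, not the method
    used for the trials in the abstract.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    events = 4.0 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2
    return math.ceil(events)


# Hypothetical target effect: hazard ratio 0.7
n_events = required_events(0.7)
print(n_events)
```

The required number of events grows quickly as the hazard ratio approaches 1, which is one reason why a subset containing only a fraction of the planned sample can be expected to give much less precise estimates.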

Using a pool of available baseline covariates, we select the best model for the time-to-event endpoint by the Akaike information criterion (AIC). The covariates in this best model are then included in a logistic regression model to estimate the propensity scores.
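
The second step, estimating propensity scores by logistic regression, can be sketched as follows. This is a minimal illustration on simulated data, assuming the covariates have already been selected; the AIC-based selection of the time-to-event model is not shown. The helper names `fit_logistic` and `propensity_scores`, the gradient-ascent fitting and all data are hypothetical and are not the authors' implementation.

```python
import math
import random


def fit_logistic(X, t, lr=0.1, n_iter=2000):
    """Fit a logistic regression by batch gradient ascent.

    Returns [intercept, coef_1, ..., coef_p]. Illustrative only; in practice
    a statistics package would be used.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            resid = ti - 1.0 / (1.0 + math.exp(-z))
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Estimated probability of receiving the experimental treatment."""
    scores = []
    for xi in X:
        z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


# Simulated example: two standardized baseline covariates and
# randomized allocation that does not depend on them.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
t = [rng.randint(0, 1) for _ in range(200)]

beta = fit_logistic(X, t)
ps = propensity_scores(X, beta)
print(min(ps), max(ps))
```

Under randomized allocation the treatment assignment is independent of the covariates, so the estimated scores should vary only through chance imbalance.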

The following propensity score methods are evaluated: matching, weighting (inverse probability of treatment weighting) and stratification. Hazard ratios from the propensity score analyses are then compared to those obtained from standard Cox modelling.
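
The three approaches differ in how they use the estimated score. The sketch below shows one simple variant of each, given a list of propensity scores and treatment indicators. The abstract does not specify the matching algorithm, the caliper, the weight definition or the number of strata, so greedy 1:1 nearest-neighbour matching, a caliper of 0.1, unstabilized weights and equal-sized strata are all assumptions, as are the function names and the toy data.

```python
def iptw_weights(ps, t):
    """Inverse probability of treatment weights: 1/ps for treated, 1/(1-ps) for controls."""
    return [1.0 / p if ti == 1 else 1.0 / (1.0 - p) for p, ti in zip(ps, t)]


def stratify(ps, n_strata=5):
    """Assign each subject to one of n_strata equal-sized strata by propensity score rank."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    labels = [0] * len(ps)
    for rank, i in enumerate(order):
        labels[i] = rank * n_strata // len(ps)
    return labels


def greedy_match(ps, t, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (treated_index, control_index) pairs; treated subjects without a
    control inside the caliper stay unmatched.
    """
    controls = [i for i, ti in enumerate(t) if ti == 0]
    pairs = []
    for i in (i for i, ti in enumerate(t) if ti == 1):
        if not controls:
            break
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        if abs(ps[j] - ps[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs


# Toy data (hypothetical): six subjects, alternating treated and control
ps = [0.30, 0.35, 0.50, 0.52, 0.70, 0.72]
t = [1, 0, 1, 0, 1, 0]

weights = iptw_weights(ps, t)
labels = stratify(ps, n_strata=3)
pairs = greedy_match(ps, t)
print(weights, labels, pairs)
```

In each case the resulting matched sample, weights or strata would then enter a Cox model to estimate the hazard ratio; that step is omitted here.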

To examine the quality of matching we use standardized mean differences as well as histograms of propensity score distributions before and after matching.
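
For a continuous covariate, the standardized mean difference is commonly defined as the difference in group means divided by the pooled standard deviation. The sketch below uses that common definition; the abstract does not state which variant the authors used, and the function name and data are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """(mean_treated - mean_control) / sqrt((var_treated + var_control) / 2)."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2.0)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical covariate values in the two groups
treated = [1.0, 2.0, 3.0]
control = [2.0, 3.0, 4.0]
smd = standardized_mean_difference(treated, control)
print(smd)
```

Computing this quantity for every covariate before and after matching shows whether matching has reduced the imbalance; a value close to zero indicates similar covariate distributions in the two groups.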

To investigate the applicability to studies with small sample sizes, we also apply all methods using only 20% of the experimental-arm subjects from the randomized trial. This yields 5 random subsets, each of which is matched with the complete control arm from the randomized trial, which serves as external control.
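
One way to obtain such subsets is to shuffle the experimental arm once and partition it into five disjoint parts. That the subsets are disjoint is an assumption, suggested by 20% fractions giving exactly 5 subsets; the function name, the seed and the subject identifiers below are hypothetical.

```python
import random


def split_into_subsets(subject_ids, n_subsets=5, seed=0):
    """Randomly partition subjects into n_subsets disjoint, near-equal subsets."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return [shuffled[k::n_subsets] for k in range(n_subsets)]


# Hypothetical experimental arm with 100 subjects
experimental_ids = list(range(100))
subsets = split_into_subsets(experimental_ids)
print([len(s) for s in subsets])
```

Each subset would then be analyzed against the full control arm, giving five hazard ratio estimates whose spread reflects the variability at the reduced sample size.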

Results: Under randomized treatment allocation, the results of the propensity score methods were comparable to standard regression results in only one of the two clinical trial examples. There were no substantial differences between the investigated propensity score methods. Propensity score methods also failed when the propensity score distributions of the treated and control groups differed strongly. In addition, we observed high variability in the hazard ratio estimates under small sample sizes.

Conclusion: If external controls and propensity score methods are considered for early clinical development, careful evaluation in the planning phase is essential (sample size, inclusion and exclusion criteria, method evaluation, ...). Furthermore, propensity score methods should be handled with caution in early phase trials.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.