gms | German Medical Science

65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

Predicting treatment benefit using early evidence when dealing with delayed treatment effects for time-to-event endpoints

Meeting Abstract

  • Rouven Behnisch - Institute of Medical Biometry and Informatics, University of Heidelberg, Heidelberg, Germany
  • Johannes Krisam - Institute of Medical Biometry and Informatics, University of Heidelberg, Heidelberg, Germany
  • Meinhard Kieser - Institute of Medical Biometry and Informatics, University of Heidelberg, Heidelberg, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 207

doi: 10.3205/20gmds297, urn:nbn:de:0183-20gmds2972

Published: February 26, 2021

© 2021 Behnisch et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Background: In cancer drug research and development, immunotherapy plays an increasingly important role, aiming to harness and augment the immune system. Since the immune system needs some time to respond to this kind of therapy, a common feature of immunotherapies is a delayed treatment effect [1]. Especially in immuno-oncology trials, where the outcome of interest is a time-to-event variable, this poses a considerable challenge to biostatistics: the standard statistical methods, which are often required by regulatory authorities, need a rather long follow-up period to detect an effect. Hence, the question arises whether methods exist that are more sensitive to delayed effects and that can be applied early on to generate evidence anticipating the final decision of the log-rank test, thereby reducing the trial duration.

Methods: The most commonly used test for comparing two survival curves is the log-rank test, which is known to yield maximum power under the proportional hazards (PH) assumption but suffers from a substantial loss in power if this assumption is violated [2]. To overcome this problem, several other methods have been developed, for example applying weights to the log-rank statistic that can either be fixed at the design stage of the trial [3] or chosen based on the observed data [4], [5]. Alternatively, tests based on the restricted mean survival time [6], survival proportions, or accelerated failure time (AFT) models [7] might be applied.
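To make the idea of a weighted log-rank statistic concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the fixed weights are of the Fleming-Harrington type, w(t) = S(t-)^rho * (1 - S(t-))^gamma with S the pooled Kaplan-Meier estimate, which the abstract does not state explicitly. The function name `fh_weighted_logrank` and its parameters are hypothetical and chosen for illustration only.

```python
import math


def fh_weighted_logrank(times, events, groups, rho=0.0, gamma=0.0):
    """Z statistic of a Fleming-Harrington G(rho, gamma) weighted log-rank test.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    groups : 0 = control, 1 = treatment
    A positive Z means fewer events than expected in the treatment arm.
    """
    data = sorted(zip(times, events, groups))
    n_risk = len(data)                      # patients at risk, both arms
    n1_risk = sum(g for _, _, g in data)    # patients at risk, treatment arm
    surv = 1.0                              # pooled Kaplan-Meier estimate S(t-)
    num = 0.0                               # weighted sum of observed - expected
    var = 0.0                               # hypergeometric variance of num
    i = 0
    while i < len(data):
        t = data[i][0]
        d = d1 = left = left1 = 0
        # Gather all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            _, e, g = data[i]
            d += e
            d1 += e * g
            left += 1
            left1 += g
            i += 1
        if d > 0:
            w = surv ** rho * (1.0 - surv) ** gamma
            p = n1_risk / n_risk
            num += w * (d1 - d * p)
            if n_risk > 1:
                var += w * w * d * p * (1.0 - p) * (n_risk - d) / (n_risk - 1)
            surv *= 1.0 - d / n_risk
        n_risk -= left
        n1_risk -= left1
    return -num / math.sqrt(var) if var > 0 else 0.0


# Toy data (made up for illustration): control patients fail first.
toy_times = [1, 2, 3, 4, 5, 6, 7, 8]
toy_events = [1] * 8
toy_groups = [0, 0, 0, 0, 1, 1, 1, 1]
z_standard = fh_weighted_logrank(toy_times, toy_events, toy_groups)
z_late = fh_weighted_logrank(toy_times, toy_events, toy_groups, rho=0.0, gamma=1.0)
```

With rho = 0 and gamma = 0 all weights equal one and the statistic reduces to the standard log-rank test; rho = 0 and gamma = 1 places more weight on late time points, which is the situation relevant for delayed treatment effects.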

We will compare these different methods systematically with regard to type I error control and power in the presence of delayed treatment effects. We will then investigate by simulation studies whether and to what extent these methods can predict the decision of the log-rank test early on. To this end, the results of the alternative methods at an early stage will be compared with the final decision of the log-rank test in terms of sensitivity and specificity (the probability of anticipating a positive or a negative final decision, respectively), considering several scenarios of delayed treatment effects under various parameter settings.
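The comparison of an early analysis with the final log-rank decision can be sketched as a small simulation. This is an illustration under strong simplifying assumptions that are not taken from the abstract: all patients are enrolled at time zero, the only censoring is administrative at the analysis time, the delayed effect follows a piecewise exponential model (hazard ratio 1 before the delay, `hr` afterwards), the early test uses weights (1 - S(t-))^gamma, and significance is declared for Z > 1.96. All function names and default values are hypothetical.

```python
import math
import random


def weighted_logrank_z(times, events, groups, gamma=0.0):
    """Log-rank Z with weights (1 - S(t-))**gamma; gamma = 0 is the standard test."""
    data = sorted(zip(times, events, groups))
    n, n1 = len(data), sum(groups)
    surv, num, var, i = 1.0, 0.0, 0.0, 0
    while i < len(data):
        t = data[i][0]
        d = d1 = left = left1 = 0
        while i < len(data) and data[i][0] == t:   # handle ties (e.g. censoring)
            d += data[i][1]
            d1 += data[i][1] * data[i][2]
            left += 1
            left1 += data[i][2]
            i += 1
        if d > 0:
            w, p = (1.0 - surv) ** gamma, n1 / n
            num += w * (d1 - d * p)
            if n > 1:
                var += w * w * d * p * (1.0 - p) * (n - d) / (n - 1)
            surv *= 1.0 - d / n
        n, n1 = n - left, n1 - left1
    return -num / math.sqrt(var) if var > 0 else 0.0


def simulate_trial(rng, n_per_arm, lam, hr, delay):
    """Event times with a delayed effect: the hazard ratio applies only after `delay`."""
    groups = [0] * n_per_arm + [1] * n_per_arm
    times = []
    for g in groups:
        t = rng.expovariate(lam)
        if g == 1 and t > delay:
            t = delay + rng.expovariate(lam * hr)  # memoryless restart at the delay
        times.append(t)
    return times, groups


def is_significant(times, groups, cutoff, gamma, crit=1.96):
    """Apply administrative censoring at `cutoff` and test one-sided."""
    observed = [min(t, cutoff) for t in times]
    events = [1 if t <= cutoff else 0 for t in times]
    return weighted_logrank_z(observed, events, groups, gamma) > crit


def sensitivity_specificity(n_sim=200, seed=1, n_per_arm=100, lam=0.1, hr=0.5,
                            delay=4.0, t_early=12.0, t_final=36.0, gamma=1.0):
    """Agreement of the early weighted test with the final standard log-rank test."""
    rng = random.Random(seed)
    tp = fp = tn = fn = 0
    for _ in range(n_sim):
        times, groups = simulate_trial(rng, n_per_arm, lam, hr, delay)
        early = is_significant(times, groups, t_early, gamma)
        final = is_significant(times, groups, t_final, 0.0)
        tp += early and final
        fp += early and not final
        fn += (not early) and final
        tn += (not early) and (not final)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec
```

Calling `sensitivity_specificity(gamma=0.0)` and `sensitivity_specificity(gamma=1.0)` with the same seed gives a rough impression of how the choice of weights shifts the balance between sensitivity and specificity in this toy setting; the numbers depend entirely on the assumed parameters and do not reproduce the study's results.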

Results: First results show that, by construction, the weighted log-rank tests which place more weight on late time points have greater power to detect differences when the treatment effect is delayed. Consequently, these tests have a high sensitivity but tend to detect too many effects that cannot be confirmed by the log-rank test later on, which reduces their specificity. In contrast, tests with lower power at an early stage achieve a higher specificity, which compensates for their decreased sensitivity.

Conclusion: There are various possible alternatives to the log-rank test to analyze data that do not meet the proportional hazards assumption due to a delayed treatment effect. Some of these methods are specifically designed to have greater power for assessing late differences. However, they should be used carefully since they tend to detect differences that cannot be verified by the log-rank test. Methods that consider weights based on the observed data could hence represent a more adequate alternative to create preliminary evidence for the decision of the log-rank test in the final analysis.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Chen T. Statistical issues and challenges in immuno-oncology. Journal for ImmunoTherapy of Cancer. 2013;1:18. DOI: 10.1186/2051-1426-1-18
2. Peto R, Peto J. Asymptotically Efficient Rank Invariant Test Procedures. Journal of the Royal Statistical Society Series A (General). 1972;135(2):185-207.
3. Fleming TR, Harrington DP. Counting Processes and Survival Analysis. New York [u.a.]: Wiley-Interscience Publ.; 1991.
4. Yang S, Prentice R. Improved Logrank-Type Tests for Survival Data Using Adaptive Weights. Biometrics. 2010;66(1):30-38.
5. Magirr D, Burman C. Modestly weighted logrank tests. Statistics in Medicine. 2019;38(20):3782-3790.
6. Royston P, Parmar MKB. Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Medical Research Methodology. 2013;13(1):152.
7. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. New York [u.a.]: Wiley; 1980.