gms | German Medical Science

66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e. V. (TMF)

26. - 30.09.2021, online

Statistical Inference for Diagnostic Test Accuracy Studies with Multiple Comparisons: The R Package DTAmc

Meeting Abstract

  • Max Westphal - Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
  • Antonia Zapf - Department of Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 26.-30.09.2021. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 130

doi: 10.3205/21gmds077, urn:nbn:de:0183-21gmds0770

Published: September 24, 2021

© 2021 Westphal et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Diagnostic accuracy studies are usually designed to assess the sensitivity and specificity of an index test in relation to a reference standard or an established comparator test. This so-called co-primary endpoint analysis has recently been extended to the case in which multiple index tests are investigated in the same study [1]. Such a (paired) design allows the diagnostic accuracy of different biomarkers or risk models to be assessed simultaneously. It also allows simultaneous inference for different cutpoints of a single biomarker or model, which imply different decision rules. These properties are highly relevant in modern applications, where multiple promising biomarkers and/or models are typically derived from high-dimensional data but the single best candidate often cannot be identified with certainty before the evaluation study.
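As a rough illustration of the co-primary decision rule with several index tests, the following Python sketch declares a candidate successful only if both its sensitivity and its specificity exceed prespecified thresholds, with a Bonferroni adjustment across candidates. It is not the DTAmc interface: the function name, the argument names, the score-type z statistic and the default one-sided level of 0.025 are all assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def coprimary_bonferroni(tp, n_dis, tn, n_non, se0, sp0, alpha=0.025):
    """Hypothetical helper, not part of DTAmc.

    tp[j], tn[j]: true positives / true negatives of candidate j,
    all candidates evaluated on the same n_dis diseased and n_non
    non-diseased subjects (paired design).
    Returns one boolean per candidate: True if the null hypothesis
    'Se_j <= se0 or Sp_j <= sp0' is rejected.
    """
    m = len(tp)
    # Bonferroni: one-sided critical value at level alpha / m (assumption).
    crit = NormalDist().inv_cdf(1 - alpha / m)
    # Standard errors under the null values (score-type statistic),
    # which avoids division by zero when an estimate equals 0 or 1.
    se_null = sqrt(se0 * (1 - se0) / n_dis)
    sp_null = sqrt(sp0 * (1 - sp0) / n_non)
    decisions = []
    for j in range(m):
        z_se = (tp[j] / n_dis - se0) / se_null
        z_sp = (tn[j] / n_non - sp0) / sp_null
        # Co-primary endpoints: BOTH statistics must exceed the critical value.
        decisions.append(min(z_se, z_sp) > crit)
    return decisions


# Two candidate tests, 100 diseased and 200 non-diseased subjects.
decisions = coprimary_bonferroni(
    tp=[90, 80], n_dis=100, tn=[180, 150], n_non=200, se0=0.8, sp0=0.7
)
```

The Bonferroni adjustment ignores the correlation between candidates that arises from evaluating them on the same subjects, which is one reason less conservative procedures are of interest in this setting.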

In this talk, we give an overview of suitable multiple test procedures for this scenario. Besides classical parametric corrections (Bonferroni, maxT), we also consider bootstrap approaches and a Bayesian procedure. All methods are implemented in the new R package DTAmc, which is also presented. A thorough simulation study was conducted to compare the family-wise error rate (FWER) and power of these procedures [2]. An important observation from the simulation study is the wide variability of rejection rates across settings. Under least-favourable parameter configurations, several methods fail to control the FWER for smaller sample sizes. In a more realistic simulation setting, FWER control is not a problem, but differences in statistical power are more pronounced. We discuss these findings and point out consequences for practitioners.
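The idea of estimating the FWER by simulation can be sketched in a few lines of Python. This is a toy setup and not the simulation study of [2]: it assumes independent candidates (a paired design would induce correlation), a Bonferroni-adjusted score-type z test, and a configuration in which every candidate has sensitivity exactly at the threshold and specificity equal to 1, so that every rejection is a false one. All names and default values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejects(tp, n_dis, tn, n_non, se0, sp0, crit):
    """Co-primary decision for one candidate (score-type z statistics)."""
    z_se = (tp / n_dis - se0) / sqrt(se0 * (1 - se0) / n_dis)
    z_sp = (tn / n_non - sp0) / sqrt(sp0 * (1 - sp0) / n_non)
    return min(z_se, z_sp) > crit


def estimate_fwer(m=3, n_dis=100, n_non=100, se0=0.8, sp0=0.8,
                  alpha=0.025, n_sim=2000, seed=1):
    """Monte Carlo estimate of the family-wise error rate.

    Assumed configuration: true sensitivity == se0 (null boundary) and
    true specificity == 1 for all m candidates, drawn independently.
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / m)  # Bonferroni critical value
    runs_with_false_rejection = 0
    for _sim in range(n_sim):
        any_rejected = False
        for _cand in range(m):
            # Binomial draw for the number of true positives.
            tp = sum(rng.random() < se0 for _ in range(n_dis))
            tn = n_non  # specificity of 1: every non-diseased subject is negative
            if rejects(tp, n_dis, tn, n_non, se0, sp0, crit):
                any_rejected = True
        runs_with_false_rejection += any_rejected
    return runs_with_false_rejection / n_sim


fwer_hat = estimate_fwer(n_sim=500)
```

Rerunning `estimate_fwer` with smaller `n_dis` or a different `m` shows how strongly the estimated rate can depend on the setting, because the binomial counts are discrete and the normal approximation is only asymptotic.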

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Westphal M, Zapf A, Brannath W. A multiple testing framework for diagnostic accuracy studies with co-primary endpoints [Preprint]. arXiv. 2019. arXiv:1911.02982
2. Westphal M, Zapf A. Statistical Inference for Diagnostic Test Accuracy Studies with Multiple Comparisons. Forthcoming 2021.