gms | German Medical Science

66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e. V. (TMF)

26. - 30.09.2021, online

Heterogeneity tools in DataSHIELD

Meeting Abstract

  • Theodoros Papakonstantinou - Institut für Medizinische Biometrie und Statistik (IMBI) Universitätsklinikum Freiburg, Freiburg, Germany
  • Daniela Zöller - Institut für Medizinische Biometrie und Statistik (IMBI) Universitätsklinikum Freiburg, Freiburg, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 66. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 12. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 26.-30.09.2021. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 175

doi: 10.3205/21gmds091, urn:nbn:de:0183-21gmds0918

Published: September 24, 2021

© 2021 Papakonstantinou et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Researchers and health care professionals often face challenges in accessing individual-level data due to ethico-legal considerations. DataSHIELD [1], which aggregates anonymous summary statistics from harmonized individual-level databases, provides a novel and simple approach to such challenges. Researchers can analyze pooled data via parallelized analysis and distributed computing in several settings, for example where data sharing is restricted but a joint analysis of individual-level data from several studies is deemed necessary.

Although the DataSHIELD framework was originally developed to obtain combined results similar to those obtained from pooled individual data, DataSHIELD can also be used to conduct study-level meta-analysis (SLMA), which is equivalent to a two-stage individual patient data meta-analysis. Data are first synthesized within each center, and the summary statistics are then combined across centers. Different centers, however, may produce different results, and this variation appears as statistical heterogeneity. Quantification of heterogeneity in DataSHIELD analyses is possible, but the full potential of tools to measure, visualize and explore heterogeneity has not been recognized.

Objective: We aim to present methods to measure, visualize and explore heterogeneity when individual-level data of several centers are analyzed via DataSHIELD.

Methods: We fit a random-effects SLMA to analyze data coming from different centers via DataSHIELD. We quantify heterogeneity across centers using heterogeneity estimators [2] from the meta-analysis setting. We also produce the I² statistic, which quantifies the proportion of variation among centers that cannot be explained by chance. We visualize the impact of heterogeneity on the summary results through prediction intervals, which show the range in which the underlying effect of a new center is likely to lie. Moreover, we use methods available in the meta-analytic literature to explore heterogeneity; these include subgroup analysis and meta-regression, which investigate the impact of certain characteristics on the summary effect. We produce a heat plot showing the impact of each study on heterogeneity measures.
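The quantities named in the Methods can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator (one of the estimators reviewed in [2]; the abstract does not say which one was used) and the common t-based prediction interval with k - 2 degrees of freedom. The function name `random_effects_slma`, its parameters, and the example numbers are hypothetical; they are not the authors' functions or data. The t quantile is passed in by the caller so that the sketch needs only the standard library.

```python
import math


def random_effects_slma(effects, std_errors, t_crit):
    """DerSimonian-Laird random-effects pooling of per-center estimates.

    effects, std_errors: one estimate and standard error per center.
    t_crit: 97.5% quantile of a t distribution with k - 2 degrees of
    freedom (k = number of centers), supplied by the caller.
    """
    k = len(effects)
    if k < 3:
        raise ValueError("a prediction interval needs at least 3 centers")

    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1

    # Between-center variance (tau^2), truncated at zero
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)

    # I^2: share of total variation not explained by chance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random-effects summary estimate and its standard error
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))

    # Prediction interval for the underlying effect in a new center
    half_width = t_crit * math.sqrt(tau2 + se_mu ** 2)
    return {
        "mu": mu,
        "tau": math.sqrt(tau2),
        "i2": i2,
        "pi_low": mu - half_width,
        "pi_high": mu + half_width,
    }


# Invented mean differences from five centers (illustration only)
effects = [0.2, 1.5, 0.8, 1.1, 0.6]
std_errors = [0.30, 0.25, 0.35, 0.30, 0.40]
result = random_effects_slma(effects, std_errors, t_crit=3.182)  # t(0.975, df=3)
print(result)
```

In a DataSHIELD setting, only the per-center estimates and standard errors would leave each center; the pooling itself runs on the analyst's side.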

Results: We use the R package meta and implement the aforementioned methods in R functions. We apply our functions to a fictional example using the mean difference as effect size. The heterogeneity standard deviation is estimated at 0.42 and the prediction interval is -0.38 to 2.28. The produced heat plot (Figure 1) shows that center 2 has a large impact on heterogeneity; removing it would result in a drop of I² from 0.57 to 0.37 and a drop of the heterogeneity standard deviation to 0.27.
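The effect of removing a single center can be sketched as a leave-one-out recomputation of the heterogeneity measures. This is an assumption about how such an impact could be computed, not a description of the authors' heat plot; the DerSimonian-Laird estimator, the names `heterogeneity` and `leave_one_out`, and the data are hypothetical, and the numbers are invented rather than taken from the example in the abstract.

```python
import math


def heterogeneity(effects, std_errors):
    """Return (tau, I2) using the DerSimonian-Laird estimator."""
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return math.sqrt(tau2), i2


def leave_one_out(effects, std_errors):
    """Recompute tau and I2 with each center removed in turn."""
    rows = []
    for i in range(len(effects)):
        rest_y = effects[:i] + effects[i + 1:]
        rest_se = std_errors[:i] + std_errors[i + 1:]
        tau, i2 = heterogeneity(rest_y, rest_se)
        rows.append((i + 1, tau, i2))  # centers are numbered from 1
    return rows


# Invented data in which center 2 differs clearly from the others
effects = [0.5, 2.0, 0.6, 0.4, 0.7]
std_errors = [0.3, 0.3, 0.3, 0.3, 0.3]

tau_all, i2_all = heterogeneity(effects, std_errors)
print(f"all centers: tau={tau_all:.2f}, I2={i2_all:.2f}")
for center, tau, i2 in leave_one_out(effects, std_errors):
    print(f"without center {center}: tau={tau:.2f}, I2={i2:.2f}")
```

The resulting table of per-center values is the kind of input a heat plot could color-code; the plotting step itself is omitted here.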

Discussion - Conclusions: DataSHIELD offers the possibility of undertaking individual patient data meta-analysis without physically sharing those data. The developed methods and their implementation extend the DataSHIELD arsenal with tools that allow quantifying, encompassing, visualizing and exploring heterogeneity.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Wilson RC, Butters OW, Avraam D, Baker J, Tedds JA, Turner A, et al. DataSHIELD – New Directions and Dimensions. Data Science Journal. 2017;16:21. DOI: 10.5334/dsj-2017-021
2. Veroniki AA, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G, et al. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Res Synth Methods. 2016;7:55-79. DOI: 10.1002/jrsm.1164