gms | German Medical Science

64. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

08. - 11.09.2019, Dortmund

omopRds: transfer of data models from OMOP to DataSHIELD/Opal

Meeting Abstract

  • Petar Horki - Institute of Medical Biometry and Statistics, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Tumor Center Freiburg (CCCF), Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
  • Stefan Lenz - Institute of Medical Biometry and Statistics, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
  • Julian Gruendner - Lehrstuhl für Medizinische Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  • Christian Maier - Lehrstuhl für Medizinische Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  • Alexander Liebler - Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany; University Medical Centre Mannheim, Mannheim, Germany
  • Martin Boeker - Institute of Medical Biometry and Statistics, Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; Tumor Center Freiburg (CCCF), Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 64. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Dortmund, 08.-11.09.2019. Düsseldorf: German Medical Science GMS Publishing House; 2019. DocAbstr. 244

doi: 10.3205/19gmds028, urn:nbn:de:0183-19gmds0288

Published: September 6, 2019

© 2019 Horki et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Distributed data analysis across university hospitals is greatly facilitated by both a common data model (CDM) with a shared vocabulary and privacy-preserving data analysis. The former is implemented in the OMOP (Observational Medical Outcomes Partnership) CDM [1], and the latter can be achieved with the DataSHIELD platform [2], [3] running on top of the Opal data warehouse (http://www.obiba.org/pages/products/opal). The German MIRACUM research consortium combines these two technologies to achieve distributed privacy-preserving data analysis [4], [5], [6], [7].

Even when a uniform data model for a given clinical use case exists in the OMOP CDM, it still needs to be made available in the DataSHIELD/Opal infrastructure. The objective of this work is to describe the omopRds package, which enables the seamless transfer of data models from the OMOP CDM to DataSHIELD/Opal.

Implementation: R was chosen for the implementation because it is an essential component of both OMOP/OHDSI (Observational Health Data Sciences and Informatics) and DataSHIELD. Furthermore, this choice allows the current implementation to be integrated directly into the existing OMOP/OHDSI user interface at a later stage.

The implementation described here transfers not only the data but also the associated concepts (e.g. data types) from the OMOP CDM. To that end, methods and data structures were implemented for (i) querying the OMOP database, (ii) transferring concepts from OMOP to DataSHIELD/Opal, and (iii) uploading the data to DataSHIELD/Opal. The omopRds functionality builds upon the OHDSI DatabaseConnector (https://github.com/OHDSI/DatabaseConnector) and SqlRender (https://github.com/OHDSI/SqlRender/) packages to query the OMOP database directly from R in (i), and upon the opalr package (https://cran.r-project.org/web/packages/opalr/index.html) to communicate with the REST interface of the DataSHIELD/Opal server in (iii).
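
The three steps can be illustrated with a minimal sketch. It is written in Python with an in-memory SQLite database purely for illustration and is not the omopRds R API. The `person` and `concept` tables are reduced to a few OMOP CDM columns, the function names are hypothetical, and step (iii) is represented only by assembling a payload, since the actual upload goes through opalr.

```python
import sqlite3


def build_demo_db():
    """Create a tiny stand-in for an OMOP database (assumption: only the
    columns needed for this example are present)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE concept (
            concept_id   INTEGER PRIMARY KEY,
            concept_name TEXT,
            domain_id    TEXT,
            concept_code TEXT
        );
        CREATE TABLE person (
            person_id         INTEGER PRIMARY KEY,
            gender_concept_id INTEGER
        );
        INSERT INTO concept VALUES
            (8507, 'Male',   'Gender', 'M'),
            (8532, 'Female', 'Gender', 'F');
        INSERT INTO person VALUES (1, 8507), (2, 8532), (3, 8532);
        """
    )
    return conn


def query_person_with_gender(conn):
    """Step (i): query the data together with the concept it refers to."""
    sql = """
        SELECT p.person_id, p.gender_concept_id, c.concept_code, c.concept_name
        FROM person AS p
        LEFT JOIN concept AS c ON c.concept_id = p.gender_concept_id
        ORDER BY p.person_id
    """
    return conn.execute(sql).fetchall()


def to_upload_payload(rows):
    """Steps (ii) and (iii): separate the data rows from the concept
    metadata and bundle both into one hypothetical upload payload."""
    data = [{"person_id": pid, "gender": code} for pid, _, code, _ in rows]
    categories = sorted(
        {(code, name) for _, _, code, name in rows if code is not None}
    )
    return {
        "data": data,
        "categories": [{"name": c, "label": n} for c, n in categories],
    }


payload = to_upload_payload(query_person_with_gender(build_demo_db()))
```

Keeping the concept metadata next to the data rows mirrors the stated goal of transferring the concepts as well as the values.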

Well-defined OMOP domains and vocabularies (e.g. gender, ethnicity) could be handled automatically, whereas more complex data types had to be built step by step. In particular, the functional programming paradigm of R was used to map OMOP domains to DataSHIELD/Opal variables.
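
A functional mapping of this kind might look roughly as follows. The sketch is in Python rather than R and does not reproduce the package's code. The helper names (`compose`, `lookup`, `DOMAIN_MAPPERS`, `map_record`) and the weight-normalisation example are assumptions made for illustration. A simple domain is handled by a single lookup function, while a more complex type is assembled from small steps.

```python
from functools import reduce


def compose(*steps):
    """Chain single-argument functions, applying them left to right."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def lookup(table, default=None):
    """Return a function that maps a key through `table`."""
    return lambda key: table.get(key, default)


def to_kg(measurement):
    """Convert a (value, unit) pair to kilograms (hypothetical example)."""
    value, unit = measurement
    factors = {"kg": 1.0, "g": 0.001}
    return value * factors[unit]


# One mapper per domain: a plain lookup for a well-defined vocabulary,
# a composed pipeline for a type that is built step by step.
DOMAIN_MAPPERS = {
    "Gender": lookup({8507: "M", 8532: "F"}),
    "Measurement": compose(to_kg, lambda kg: round(kg, 1)),
}


def map_record(record):
    """Apply the mapper registered for each domain to its value."""
    return {
        domain: DOMAIN_MAPPERS[domain](value)
        for domain, value in record.items()
    }


mapped = map_record({"Gender": 8532, "Measurement": (72500, "g")})
```

Because every mapper is an ordinary function, a new domain can be supported by adding one entry to the registry without touching the others.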

Discussion: Our omopRds package facilitates the transfer of data models from OMOP to DataSHIELD/Opal. A notable feature of omopRds is its ability to transfer data models at different levels of complexity. For example, in the OMOP "Gender" domain, the standard IDs (8507, 8532) are difficult to interpret without the associated codes ("M", "F") and/or names ("Male", "Female"). Unfortunately, the OMOP vocabulary does not provide a standard code for "diverse". Instead, the generic concept (ID = 0) can be annotated with the description "Diverse". All of this information (domain, ID, code, name, and description) can be mapped to Opal variables, attributes, and categories using omopRds.
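
The "Gender" example can be made concrete with a small sketch. It is written in Python for illustration and is not taken from omopRds. The dictionary layout only loosely follows the idea of an Opal variable with attributes and categories, and its field names are assumptions. The IDs, codes, and names come from the text above, whereas the code and name given to the generic concept are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: int
    code: str
    name: str
    description: str = ""


GENDER_CONCEPTS = [
    Concept(8507, "M", "Male"),
    Concept(8532, "F", "Female"),
    # Generic concept annotated with a description; the code "0" and the
    # name "Generic concept" are placeholders chosen for this example.
    Concept(0, "0", "Generic concept", description="Diverse"),
]


def to_variable(domain, concepts):
    """Map a domain and its concepts to a variable with attributes and
    one category per concept (hypothetical structure)."""
    return {
        "name": domain.lower(),
        "attributes": {"domain": domain},
        "categories": [
            {
                "name": c.code,
                "attributes": {
                    "concept_id": c.concept_id,
                    # Prefer the annotation over the plain concept name.
                    "label": c.description or c.name,
                },
            }
            for c in concepts
        ],
    }


gender_variable = to_variable("Gender", GENDER_CONCEPTS)
```

With this layout, an analyst who sees the category "M" can still recover the original concept ID 8507 from the category attributes.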

The omopRds package provides an end-to-end solution in R, either by integrating API calls directly into the package or by providing a wrapper function around the Opal command-line scripting tool (http://opaldoc.obiba.org/en/latest/python-user-guide/index.html), e.g. to import the uploaded DataSHIELD-compatible datasets into Opal data schemas.

Following the plan outlined in the MI-I core dataset [8], the initial application of omopRds is the transfer of the Base Module from OMOP to DataSHIELD/Opal.

Acknowledgements: We thank Raphael Scheible for his comment on citation formatting. MIRACUM is funded by the German Federal Ministry of Education and Research (BMBF) in the Medical Informatics Initiative (FKZ 01ZZ1606H).

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Hripcsak G, Duke JD, Shah NH, Reich CG, Huser V, Schuemie MJ, Suchard MA, Park RW, Wong IC, Rijnbeek PR, Van Der Lei J. Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers. Studies in Health Technology and Informatics. 2015;216:574.
2. Wilson RC, Butters OW, Avraam D, Baker J, Tedds JA, Turner A, Murtagh M, Burton PR. DataSHIELD – new directions and dimensions. Data Science Journal. 2017 Apr 19;16(21):1-21. DOI: 10.5334/dsj-2017-021
3. Wolfson M, Wallace SE, Masca N, Rowe G, Sheehan NA, Ferretti V, LaFlamme P, Tobin MD, Macleod J, Little J, Fortier I. DataSHIELD: resolving a conflict in contemporary bioscience - performing a pooled analysis of individual-level data without sharing the data. International Journal of Epidemiology. 2010 Jul 14;39(5):1372-82. DOI: 10.1093/ije/dyq111
4. Lenz S, Zöller D, Hess M, Binder H. Architectures for distributed privacy-preserving deep learning. In: Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie, editor. 63. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Osnabrück, 02.-06.09.2018. Düsseldorf: German Medical Science GMS Publishing House; 2018. DocAbstr. 207. DOI: 10.3205/18gmds097
5. Gründner J. A queue-Poll Extension - standardised, monitored, indirect and secure DataSHIELD access to your data. 2018 DataSHIELD Workshop; 2018; Newcastle.
6. Maier C, Lang L, Storf H, Vormstein P, Bieber R, Bernarding J, Herrmann T, Haverkamp C, Horki P, Laufer J, Berger F. Towards implementation of OMOP in a German university hospital consortium. Applied Clinical Informatics. 2018 Jan;9(01):054-61. DOI: 10.1055/s-0037-1617452
7. Sedlmayr M, Prokosch HU. Datenaustausch in der Forschung via OMOP/OHDSI. 2018 [Accessed 16 July 2019]. Available from: https://www.miracum.org/wp-content/uploads/2018/06/OMOP_MIRACUM_eHealth.com_Juli.2018.pdf
8. Redaktionsgruppe Kerndatensatz. MI-I-Kerndatensatz. 2017 [Accessed 16 July 2019]. Available from: https://www.medizininformatik-initiative.de/sites/default/files/inline-files/MII_04_Kerndatensatz_1-0.pdf