gms | German Medical Science

67. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 13. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e. V. (TMF)

21.08. - 25.08.2022, online

Extract, Transform and Load German Claim Data to OMOP CDM – Design and Implications

Meeting Abstract

  • Michele Zoch - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Elisa Henke - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Ines Reinecke - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Yuan Peng - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Richard Gebler - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Mirko Gruhl - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Martin Sedlmayr - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 67. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 13. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 21.-25.08.2022. Düsseldorf: German Medical Science GMS Publishing House; 2022. DocAbstr. 153

doi: 10.3205/22gmds057, urn:nbn:de:0183-22gmds0573

Published: August 19, 2022

© 2022 Zoch et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Introduction: Secondary use of clinical data is essential for generating evidence in medical research. To harmonize data, it is beneficial to use a common data model (CDM) [1]. The Observational Medical Outcomes Partnership (OMOP) CDM, maintained by the Observational Health Data Sciences and Informatics (OHDSI) community, is an important CDM for observational data [2].

Efforts are already underway to transfer German patient data to OMOP. They focus on mappings and tool development, but participation in clinical trials remains a gap [3]. Closing this gap requires transferring national, visit-related data that use German terminologies into a harmonized, patient-centric CDM with international terminologies. We therefore present an approach to extract, transform and load (ETL) German claim data into OMOP that ensures reusability by adhering to interoperability standards and providing transparent documentation.

State of the art: Maier et al. [4] took an initial step toward loading German data, such as diagnoses and procedures, into OMOP. This work was extended by mappings of, e.g., admission and discharge reasons and departments [5].

Besides CDMs, Fast Healthcare Interoperability Resources (FHIR) has evolved into a leading standard for data exchange [5]. Hence, an ETL-job that loads the FHIR-defined core data set of the German Medical Informatics Initiative (MII) [6] into OMOP is currently being developed [7].

Concept: The ETL-job uses well-defined German claim data (CSV files) as input [8]. According to §21 of the Hospital Remuneration Act (Krankenhausentgeltgesetz, KHEntgG), all German hospitals have to submit their performance data to the Institute for the Hospital Remuneration System (InEK).

An interdisciplinary team of physicians and computer scientists identified and agreed on the mapping to OMOP.
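
The kind of mapping described here can be illustrated with a minimal Python sketch. It is not the authors' Pentaho job. The CSV column names (`Patientennummer`, `KH-internes-Kennzeichen`, `Geburtsjahr`, `Geschlecht`, `Aufnahmedatum`, `Entlassungsdatum`), the semicolon delimiter, the `YYYYMMDDhhmm` timestamp format and the gender codes `m`/`w` are assumptions modeled on a §21 case file, and the function names are hypothetical. The OMOP concept IDs 8507 (male), 8532 (female) and 9201 (inpatient visit) are standard OMOP concepts; treating every case as an inpatient visit is a simplification.

```python
import csv
import io
from datetime import datetime

# Standard OMOP concept IDs; 0 stands for "no matching concept".
GENDER_CONCEPTS = {"m": 8507, "w": 8532}
INPATIENT_VISIT = 9201

# Assumed layout of a case file (column names and formats are illustrative).
SAMPLE_FALL_CSV = (
    "Patientennummer;KH-internes-Kennzeichen;Geburtsjahr;Geschlecht;"
    "Aufnahmedatum;Entlassungsdatum\n"
    "P1;F001;1956;w;202001151030;202001221400\n"
    "P2;F002;1987;m;202003010815;202003031200\n"
    "P1;F003;1956;w;202006100900;202006121100\n"
)


def parse_date(value):
    """Parse an assumed YYYYMMDDhhmm timestamp and return the calendar date."""
    return datetime.strptime(value, "%Y%m%d%H%M").date()


def map_cases(csv_text):
    """Turn visit-related rows into patient-centric person and visit rows."""
    person_ids = {}  # source patient number -> generated OMOP person_id
    persons, visits = [], []
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    for visit_id, row in enumerate(reader, start=1):
        source_patient = row["Patientennummer"]
        # Create each person only once, even if several cases refer to them.
        if source_patient not in person_ids:
            person_ids[source_patient] = len(person_ids) + 1
            persons.append({
                "person_id": person_ids[source_patient],
                "gender_concept_id": GENDER_CONCEPTS.get(row["Geschlecht"], 0),
                "year_of_birth": int(row["Geburtsjahr"]),
                "person_source_value": source_patient,
            })
        visits.append({
            "visit_occurrence_id": visit_id,
            "person_id": person_ids[source_patient],
            "visit_concept_id": INPATIENT_VISIT,
            "visit_start_date": parse_date(row["Aufnahmedatum"]),
            "visit_end_date": parse_date(row["Entlassungsdatum"]),
            "visit_source_value": row["KH-internes-Kennzeichen"],
        })
    return persons, visits


persons, visits = map_cases(SAMPLE_FALL_CSV)
```

In this sketch the three sample cases collapse into two persons, which is the visit-centric to patient-centric shift the mapping has to perform.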

Implementation: The ETL process was implemented in Pentaho Data Integration [9], resulting in a ready-to-use Java ETL-job. To allow runtime flexibility, it was dockerized, and the transformations can be executed serially or in parallel, depending on the workload. It will be made available on OHDSI Germany’s GitHub.
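
The choice between serial and parallel execution can be sketched in a few lines of Python. This is an illustration of the idea only, not the Pentaho or Docker configuration of the actual job; the helper names `make_transformation` and `run_transformations` are hypothetical, and the sketch assumes that the transformations are independent of each other.

```python
from concurrent.futures import ThreadPoolExecutor


def make_transformation(table):
    """Return a hypothetical transformation that pretends to load one table."""
    def run():
        return f"{table} loaded"
    return run


def run_transformations(steps, parallel=False, max_workers=4):
    """Run independent transformations one after another or in a thread pool."""
    if not parallel:
        return [step() for step in steps]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in submission order, so both modes agree.
        return list(pool.map(lambda step: step(), steps))


steps = [
    make_transformation(table)
    for table in ("condition_occurrence", "procedure_occurrence", "observation")
]
serial_result = run_transformations(steps)
parallel_result = run_transformations(steps, parallel=True)
```

A real job would additionally have to respect ordering constraints, for example loading persons before the records that reference them.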

Lessons learned: The performance of the resulting ETL-job is sufficient; the associated figures are given in Table 1 [Tab. 1]. The input data correspond to one year of hospital data.

A data quality check with ACHILLES Heel, which reported a low number of errors, and the creation of cohorts in ATLAS were both successful. The team also independently checked the content quality of the output.
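
To give an impression of what rule-based plausibility checks on loaded data look like, the following Python sketch applies two hand-written rules to visit records. It does not reproduce ACHILLES Heel or its rule set; the function `check_visits`, the rule wording and the sample records are hypothetical.

```python
from datetime import date


def check_visits(persons, visits):
    """Return (visit_occurrence_id, message) pairs for implausible visits."""
    known_persons = {person["person_id"] for person in persons}
    issues = []
    for visit in visits:
        # Rule 1: every visit must reference an existing person.
        if visit["person_id"] not in known_persons:
            issues.append((visit["visit_occurrence_id"], "unknown person_id"))
        # Rule 2: a visit cannot end before it starts.
        if visit["visit_end_date"] < visit["visit_start_date"]:
            issues.append(
                (visit["visit_occurrence_id"], "visit ends before it starts")
            )
    return issues


sample_persons = [{"person_id": 1}]
sample_visits = [
    {"visit_occurrence_id": 1, "person_id": 1,
     "visit_start_date": date(2020, 1, 15), "visit_end_date": date(2020, 1, 22)},
    {"visit_occurrence_id": 2, "person_id": 1,
     "visit_start_date": date(2020, 3, 3), "visit_end_date": date(2020, 3, 1)},
    {"visit_occurrence_id": 3, "person_id": 2,
     "visit_start_date": date(2020, 6, 10), "visit_end_date": date(2020, 6, 12)},
]
issues = check_visits(sample_persons, sample_visits)
```

Here the second sample visit violates the date rule and the third references a person that was never loaded.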

The ETL-job has already been successfully tested by ten university hospitals within the MIRACUM project (FKZ 01ZZ1801A/L) [10].

The presented ETL-job is sustainable because it is based on well-structured data whose format is defined by law. Thus, non-academic German hospitals outside the MII are also able to transfer data to OMOP. As part of MiHUBx (FKZ 01ZZ2101A) [11], regional hospitals are currently undertaking first efforts to run the ETL-job. This is an important step in preparing them for participation in international research projects within the OHDSI community.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1.
Garza M, Del Fiol G, Tenenbaum J, Walden A, Zozus MN. Evaluating common data models for use with a longitudinal community registry. J Biomed Inform. 2016;64:333–41.
2.
Hripcsak G, Duke JD, Shah NH, Reich CG, Huser V, Schuemie MJ, et al. Observational Health Data Sciences and Informatics (OHDSI): Opportunities for Observational Researchers. Stud Health Technol Inform. 2015;(216):574–8.
3.
Reinecke I, Zoch M, Reich C, Sedlmayr M, Bathelt F. The Usage of OHDSI OMOP - A Scoping Review. Stud Health Technol Inform. 2021;283:95–103.
4.
Maier C, Lang L, Storf H, Vormstein P, Bieber R, Bernarding J, et al. Towards Implementation of OMOP in a German University Hospital Consortium. Appl Clin Inform. 2018;09(01):054–61.
5.
Lehne M, Luijten S, Vom Felde Genannt Imbusch P, Thun S. The Use of FHIR in Digital Health – A Review of the Scientific Literature. Ger Med Data Sci Shap Change – Creat Solut Innov Med. 2019:52–8.
6.
Ganslandt T, Boeker M, Löbe M, Prasser F, Schepers J, Semler S, et al. Der Kerndatensatz der Medizininformatik-Initiative: Ein Schritt zur Sekundärnutzung von Versorgungsdaten auf nationaler Ebene. Forum Med-Dok Med-Inform. 2018;20(1):17–21.
7.
Henke E, Peng Y, Reinecke I, Zoch M, Sedlmayr M. Development of an ETL Process for Bulk and Incremental Load of German Patient Data into OMOP CDM Using FHIR. In: OHDSI 2021 Global Symposium. 2021 [cited 2022 Mar 15]. Available from: https://www.ohdsi.org/2021-global-symposium-showcase-44/
8.
InEK GmbH. Datensatzbeschreibung [Internet]. [cited 2022 Mar 16]. Available from: https://www.g-drg.de/Datenlieferung_gem._21_KHEntgG/Datenlieferung_gem._21_Abs.1_KHEntgG/Dokumente_zur_Datenlieferung/Datensatzbeschreibung
9.
Hitachi. Pentaho Data Integration and Analytics-Plattform [Internet]. [cited 2022 Mar 22]. Available from: https://www.hitachivantara.com/de-de/products/data-management-analytics/pentaho-platform.html
10.
Prokosch HU, Acker T, Bernarding J, Binder H, Boeker M, Boerries M, et al. MIRACUM: Medical Informatics in Research and Care in University Medicine. Methods Inf Med. 2018;57(S 01):e82–91.
11.
Bundesministerium für Bildung und Forschung. MiHUBx: ein digitales Ökosystem für Forschung, Diagnostik und Therapie. Digitale FortschrittsHubs Gesundheit. [cited 2022 Mar 21]. Available from: https://www.gesundheitsforschung-bmbf.de/de/mihubx-ein-digitales-okosystem-fur-forschung-diagnostik-und-therapie-13054.php
12.
Kümmel M, Reinecke I, Gruhl M, Bathelt F, Sedlmayr M. Transition Database for a harmonized mapping of German patient data to the OMOP CDM. In: 2020 OHDSI European Symposium. 2020 [cited 2020 Nov 28]. Available from: https://www.ohdsi.org/2020-eu-symposium-showcase-13/