gms | German Medical Science

68. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

17.09. - 21.09.23, Heilbronn

Evaluation of an ETL process for data harmonization based on FHIR and OMOP CDM

Meeting Abstract

  • Yuan Peng - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Elisa Henke - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Franziska Bathelt - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Martin Sedlmayr - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Ines Reinecke - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 68. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS). Heilbronn, 17.-21.09.2023. Düsseldorf: German Medical Science GMS Publishing House; 2023. DocAbstr. 175

doi: 10.3205/23gmds156, urn:nbn:de:0183-23gmds1564

Published: September 15, 2023

© 2023 Peng et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Observational Health Data Sciences and Informatics (OHDSI) provides a standardized research repository, the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) [1]. Using the OMOP CDM to harmonize patients' electronic health records for research has become a global trend [2].

In the context of the Medical Informatics in Research and Care in University Medicine (MIRACUM, FKZ 01ZZ1801A/L) project, we implemented an Extract-Transform-Load (ETL) process that transforms Fast Healthcare Interoperability Resources (FHIR) to OMOP CDM [3]. To ensure the quality of our ETL-process, we present an approach for evaluating its functionality and for identifying and eliminating its shortcomings in the context of MIRACUM.
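To illustrate the kind of transformation such an ETL-process performs, the following minimal Python sketch maps a single FHIR Patient resource to a row of the OMOP CDM person table. It is not the MIRACUM implementation: the function name patient_to_person, the dictionary-based representation, and the decision to set race and ethnicity to concept 0 are assumptions made for illustration only. The gender concept IDs 8507 (MALE) and 8532 (FEMALE) are the standard OMOP concepts.

```python
from typing import Any, Dict

# Standard OMOP gender concepts; 0 means "no matching concept".
GENDER_CONCEPTS = {"male": 8507, "female": 8532}


def patient_to_person(patient: Dict[str, Any], person_id: int) -> Dict[str, Any]:
    """Map a FHIR Patient resource (as a dict) to an OMOP CDM person row.

    Hypothetical helper for illustration; a real ETL-process would also
    validate input and handle required fields such as year_of_birth.
    """
    if patient.get("resourceType") != "Patient":
        raise ValueError("expected a FHIR Patient resource")

    # FHIR birthDate may be YYYY, YYYY-MM or YYYY-MM-DD; missing parts become None.
    parts = [int(p) for p in patient.get("birthDate", "").split("-") if p]
    year, month, day = (parts + [None, None, None])[:3]

    gender = patient.get("gender")
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),
        "year_of_birth": year,
        "month_of_birth": month,
        "day_of_birth": day,
        "race_concept_id": 0,       # assumption: not mapped in this sketch
        "ethnicity_concept_id": 0,  # assumption: not mapped in this sketch
        "person_source_value": patient.get("id"),
        "gender_source_value": gender,
    }
```

Calling patient_to_person({"resourceType": "Patient", "id": "p1", "gender": "female", "birthDate": "1980-05"}, 1) yields a row with the source identifier preserved in person_source_value, so that a loaded record can be traced back to its FHIR resource.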

Methods: Our ETL-process was developed and initially evaluated by the MIRACUM Dresden team and then made available to other MIRACUM sites for further evaluation. We designed an evaluation process consisting of tasks for developers and testers. The tasks covered the creation of a description of the ETL-process and instructions for executing it, as well as the creation and assessment of questionnaires. The questionnaires used in the evaluation included questions about the infrastructure, the quality of the documentation, and the technical reports of the ETL-process.

We expected that this evaluation process would need to be conducted only twice on our ETL-process within MIRACUM.

Results: The feedback from the initial evaluation indicated that the documentation provided by the ETL developers was of excellent quality and that our ETL-process was executable at all MIRACUM sites. It also revealed performance shortcomings in our ETL-process, namely high Random Access Memory (RAM) usage and long data loading durations. Additionally, some new requirements were raised, such as adding more detail to the logging output. Our ETL-process was adjusted in response to the feedback and the new requirements from the initial evaluation. Subsequently, the ETL-process was evaluated again at all sites using the same evaluation process.

The results of the second evaluation showed that the RAM usage had been reduced, although the issue of long loading durations remained. Furthermore, new requirements were proposed, such as supporting new versions of the MII FHIR profiles.
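Performance figures such as memory usage and loading duration could be captured per ETL step and written to the log, for example as in the following Python sketch. This is an illustrative assumption, not the measurement method used in the evaluation: run_with_metrics and the logger name etl_eval are hypothetical, and tracemalloc only tracks memory allocated by the Python interpreter, not the total RAM of the process or of a database.

```python
import logging
import time
import tracemalloc
from typing import Callable, Dict

logger = logging.getLogger("etl_eval")  # hypothetical logger name


def run_with_metrics(step_name: str, step: Callable[[], int]) -> Dict[str, float]:
    """Run one ETL step and report its duration and peak Python memory.

    `step` is any callable that performs the work and returns the number
    of rows it processed.
    """
    tracemalloc.start()
    started = time.perf_counter()
    try:
        rows = step()
        seconds = time.perf_counter() - started
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()

    metrics = {
        "rows": rows,
        "seconds": seconds,
        "peak_mib": peak_bytes / (1024 * 1024),
    }
    logger.info(
        "%s: %d rows in %.2f s, peak %.1f MiB",
        step_name, rows, seconds, metrics["peak_mib"],
    )
    return metrics
```

Recording the same metrics in both evaluation rounds would make a change such as reduced memory usage directly comparable between rounds.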

Discussion: As a result of the presented evaluation process, our ETL-process has improved noticeably. The evaluation process is sufficient for assessing the functionality of our ETL-process and reveals unforeseen issues in it. Moreover, it provides a transparent platform for ETL developers and testers to communicate and collaborate. So far, the evaluation process has only been conducted within MIRACUM and has been limited to infrastructure information, the sufficiency of the documentation, and the technical reports of the ETL-process. The data quality in OMOP CDM has not yet been verified. This will be part of future work using the OHDSI Data Quality Dashboard.

Conclusions: Our ETL-process that transforms FHIR resources to OMOP CDM is applicable at all MIRACUM sites. The results of the presented evaluation process provide valuable information for further development and can help to drive improvements of ETL-processes in general. Therefore, the same evaluation process will be used to further improve our ETL-process within the CODEX+ project [4].

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Observational Health Data Sciences and Informatics. The book of OHDSI: Observational Health Data Sciences and Informatics. San Bernardino (CA): OHDSI; 2019.
2. Reinecke I, Zoch M, Reich C, Sedlmayr M, Bathelt F. The Usage of OHDSI OMOP - A Scoping Review. Stud Health Technol Inform. 2021 Sep 21;283:95–103.
3. Peng Y, Henke E, Reinecke I, Zoch M, Sedlmayr M, Bathelt F. An ETL-process design for data harmonization to participate in international research with German real-world data based on FHIR and OMOP CDM. Int J Med Inform. 2023 Jan;169:104925.
4. CODEX+ Netzwerk Universitätsmedizin [Internet]. [cited 2023 Apr 24]. Available from: https://www.netzwerk-universitaetsmedizin.de/projekte/codex-plus