gms | German Medical Science

65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

Simulation of ODM, FHIR and openEHR data using DataFrames

Meeting Abstract

  • Johannes Oehm - Westfälische Wilhelms-Universität Münster, Münster, Germany
  • Michael Storck - Westfälische Wilhelms-Universität Münster, Münster, Germany
  • Tobias Brix - Westfälische Wilhelms-Universität Münster, Münster, Germany
  • Maximilian Fechner - Westfälische Wilhelms-Universität Münster, Münster, Germany
  • Martin Dugas - Westfälische Wilhelms-Universität Münster, Münster, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 308

doi: 10.3205/20gmds168, urn:nbn:de:0183-20gmds1686

Published: 26 February 2021

© 2021 Oehm et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information see http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: A variety of interoperability standards are used in medical software. In the Medical Informatics Initiative [1], for example, the HiGHmed consortium [2] uses openEHR [3], while the core data set will be modelled using FHIR [4]. For clinical trials, CDISC ODM [5] is commonly used.

All of these standards consist of a metadata format for modelling the medical data (which elements are available, which data types they have, and which constraints apply to them) and a representation of the actual data. Although the hierarchical nature of these standards is convenient for software developers, statisticians frequently end up transforming the data into a flat tabular structure (DataFrame) for analysis. This tabular representation is usually stored as a CSV file.
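The transformation from a hierarchical record to a flat table can be illustrated with a minimal Python sketch. The record layout, the "/" path separator and the function names are assumptions made for illustration only; they are not taken from any of the standards or from the tool described below.

```python
import csv
import io


def flatten(record, prefix=""):
    """Flatten nested dicts and lists into one {path: value} row."""
    row = {}
    if isinstance(record, dict):
        for key, value in record.items():
            row.update(flatten(value, f"{prefix}/{key}" if prefix else key))
    elif isinstance(record, list):
        # Repeated elements get an index suffix (an assumed convention).
        for index, value in enumerate(record):
            row.update(flatten(value, f"{prefix}:{index}"))
    else:
        row[prefix] = record
    return row


def to_csv(records):
    """Flatten every record and write all rows into one CSV string."""
    rows = [flatten(record) for record in records]
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical hierarchical records, one per patient.
RECORDS = [
    {"patient": {"id": "P1", "vitals": {"pulse": 72}}},
    {"patient": {"id": "P2", "vitals": {"pulse": 80, "temp": 36.6}}},
]
```

Calling to_csv(RECORDS) yields one column per leaf path and one row per record; values missing from a record are left empty.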

Methods: We developed a DataFrame-Generator (https://dataframegenerator.uni-muenster.de/) in which columns can be configured manually or imported automatically from the metadata formats mentioned above. This allows statisticians to develop and test their analyses before any data has been captured. The tool supports uniform and Gaussian distributions for numeric and temporal values and uses the Java Faker [6] library to generate pseudo-realistic string values. The resulting file can be downloaded as a CSV or XLSX file. The software was developed using Java 8 and Spring in the backend and Bootstrap CSS and JavaScript in the frontend.
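The idea of drawing column values from configured distributions can be sketched as follows. This is not the DataFrame-Generator's implementation (which is written in Java): the column configuration keys, the function names and the convention of drawing temporal values as day offsets from a start date are all assumptions, and string generation is omitted.

```python
import csv
import io
import random
from datetime import date, timedelta


def generate_value(column, rng):
    """Draw one value for a column from its configured distribution."""
    kind = column["distribution"]
    if kind == "uniform":
        value = rng.uniform(column["min"], column["max"])
    elif kind == "gaussian":
        value = rng.gauss(column["mean"], column["sd"])
    else:
        raise ValueError(f"unknown distribution: {kind}")
    if column["type"] == "date":
        # Assumed convention: the drawn number is a day offset from "start".
        return (column["start"] + timedelta(days=round(value))).isoformat()
    if column["type"] == "integer":
        return round(value)
    return round(value, 2)


def generate_csv(columns, n_rows, seed=0):
    """Generate n_rows of simulated data and return them as CSV text."""
    rng = random.Random(seed)  # seeded so that results are reproducible
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column["name"] for column in columns])
    for _ in range(n_rows):
        writer.writerow([generate_value(column, rng) for column in columns])
    return buffer.getvalue()


# Hypothetical column configuration.
COLUMNS = [
    {"name": "age", "type": "integer",
     "distribution": "uniform", "min": 18, "max": 90},
    {"name": "weight_kg", "type": "float",
     "distribution": "gaussian", "mean": 75.0, "sd": 12.0},
    {"name": "visit_date", "type": "date",
     "distribution": "uniform", "min": 0, "max": 365,
     "start": date(2020, 1, 1)},
]
```

For example, generate_csv(COLUMNS, 100, seed=1) returns a header line followed by 100 simulated rows; the same seed always produces the same file.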

For openEHR, metadata is represented using archetypes and templates. Because the original template format is very cumbersome, the Better Platform (the most widely used implementation) defines its own WebTemplate format [7].

Results: The DataFrame-Generator can import openEHR WebTemplates and generate a DataFrame structure, where column names are derived from paths in the WebTemplate. This way, one can also directly import the generated file into the Better Platform.
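How column names might be derived from paths in a template tree is sketched below. The tree is a heavily simplified stand-in for a WebTemplate that keeps only "id" and "children" fields; the example node names, the "/" separator and the rule that only leaf nodes become columns are assumptions, since the abstract does not spell out the exact naming rules.

```python
def leaf_paths(node, parent=""):
    """Return the path of every leaf node, joined with '/'."""
    path = f"{parent}/{node['id']}" if parent else node["id"]
    children = node.get("children", [])
    if not children:
        return [path]
    paths = []
    for child in children:
        paths.extend(leaf_paths(child, path))
    return paths


# Hypothetical, simplified template tree.
TEMPLATE_TREE = {
    "id": "vital_signs",
    "children": [
        {"id": "body_temperature",
         "children": [{"id": "temperature"}, {"id": "time"}]},
        {"id": "pulse", "children": [{"id": "rate"}]},
    ],
}

COLUMN_NAMES = leaf_paths(TEMPLATE_TREE)
```

Here COLUMN_NAMES contains one entry per leaf, for example "vital_signs/body_temperature/temperature".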

For ODM, there are many different tabular representations, each with its own advantages and disadvantages. For ease of use, we chose a DataFrame representation in which one row maps to one <SubjectData/> element in the generated ODM file.
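The one-row-per-subject mapping can be illustrated with a short sketch that builds a <SubjectData/> element from a single row. The element and attribute names follow the ODM clinical data hierarchy (StudyEventData, FormData, ItemGroupData, ItemData), but the column naming scheme "StudyEventOID/FormOID/ItemGroupOID/ItemOID", the "SubjectKey" column and the helper names are assumptions; the XML namespace and repeat keys are omitted for brevity.

```python
import xml.etree.ElementTree as ET


def find_or_create(parent, tag, attribute, oid):
    """Return the child of parent with the given OID, creating it if needed."""
    for child in parent.findall(tag):
        if child.get(attribute) == oid:
            return child
    return ET.SubElement(parent, tag, {attribute: oid})


def row_to_subject_data(row):
    """Build one <SubjectData> element from one DataFrame row (a dict)."""
    subject = ET.Element("SubjectData", SubjectKey=str(row["SubjectKey"]))
    for column, value in row.items():
        if column == "SubjectKey" or value in ("", None):
            continue  # skip the key column and empty cells
        # Assumed column name layout: event/form/item group/item OIDs.
        event_oid, form_oid, group_oid, item_oid = column.split("/")
        event = find_or_create(
            subject, "StudyEventData", "StudyEventOID", event_oid)
        form = find_or_create(event, "FormData", "FormOID", form_oid)
        group = find_or_create(
            form, "ItemGroupData", "ItemGroupOID", group_oid)
        ET.SubElement(group, "ItemData", ItemOID=item_oid, Value=str(value))
    return subject


# Hypothetical row with made-up OIDs.
ROW = {
    "SubjectKey": "S001",
    "SE.1/F.1/IG.1/I.AGE": 42,
    "SE.1/F.1/IG.1/I.SEX": "f",
    "SE.1/F.2/IG.1/I.WEIGHT": "",
}

SUBJECT_XML = ET.tostring(row_to_subject_data(ROW), encoding="unicode")
```

Each further row would produce another <SubjectData/> element, and all of them would be placed inside a common <ClinicalData/> element of the ODM file.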

In FHIR, several resources are predefined and can only be further constrained using profiles. The FHIR Questionnaire resource provides a very flexible way to define metadata items, and the QuestionnaireResponse resource can then be used to store the instance data. The DataFrame-Generator can import a questionnaire and generate columns, using a custom path syntax to identify the corresponding items in the questionnaire.
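A rough sketch of deriving columns from a Questionnaire is given below. The "item", "linkId" and "type" fields are part of the FHIR Questionnaire resource, but the path syntax used here (linkIds joined with "/") is only an assumption; the tool's actual custom path syntax is not described in this abstract. The example questionnaire is hypothetical.

```python
def questionnaire_columns(questionnaire):
    """Return (path, type) pairs for every answerable questionnaire item."""
    columns = []

    def walk(items, parent):
        for item in items:
            link_id = item["linkId"]
            path = f"{parent}/{link_id}" if parent else link_id
            # Groups and display items carry no answers of their own.
            if item["type"] not in ("group", "display"):
                columns.append((path, item["type"]))
            walk(item.get("item", []), path)

    walk(questionnaire.get("item", []), "")
    return columns


# Hypothetical minimal Questionnaire resource.
QUESTIONNAIRE = {
    "resourceType": "Questionnaire",
    "status": "draft",
    "item": [
        {"linkId": "demographics", "type": "group", "item": [
            {"linkId": "age", "type": "integer"},
            {"linkId": "sex", "type": "choice"},
        ]},
        {"linkId": "notes", "type": "string"},
    ],
}

COLUMNS = questionnaire_columns(QUESTIONNAIRE)
```

The item type returned alongside each path could then be used to pick a suitable generator for the column, for instance a numeric distribution for "integer" items.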

Discussion: We have shown that it is possible to generate DataFrames from the metadata of different interoperability standards. Future work will address mapping these DataFrames to the instance representation of each standard and retrieving the DataFrame representation from the instance data for statistical evaluation. Other topics we are currently looking into are the handling of elements that may occur more than once and the generation of realistic correlations between data elements.

Acknowledgement: Supported by BMBF grant No. 01ZZ1802V (HiGHmed/Münster).

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Semler SC, Wissing F, Heyder R. German Medical Informatics Initiative. Methods Inf Med. 2018;57(S 01):e50-e56. DOI: 10.3414/ME18-03-0003
2. Haarbrandt B, et al. HiGHmed – An Open Platform Approach to Enhance Care and Research across Institutional Boundaries. Methods Inf Med. 2018 Jul;57(S 01):e66-e81. DOI: 10.3414/ME18-02-0002
3. Atalag K, Beale T, Chen R, Gornik T, Heard S, McNicoll I. A semantically enabled, vendor-independent health computing platform.
4. Benson T, Grieve G. Principles of Health Interoperability. Springer International Publishing; 2016.
5. Minjoe S. Introduction to the CDISC Standards. In: PharmaSUG 2013 Conference Proceedings.
6. Java Faker. [Accessed July 2020]. Available from: https://github.com/DiUS/java-faker
7. Simplified Data Template. [Accessed July 2020]. Available from: https://specifications.openehr.org/releases/ITS-REST/latest/simplified_data_template.html