Simulation of ODM, FHIR and openEHR data using DataFrames
Published: February 26, 2021
Introduction: Medical software uses a variety of interoperability standards. In the Medical Informatics Initiative [1], for example, the HiGHmed consortium [2] uses openEHR [3], while the core data set will be modelled using FHIR [4]. For clinical trials, CDISC ODM [5] is commonly used.
All these standards consist of a metadata format for modelling the medical data (which elements are available, which data types are present, and what constraints apply to them) and a representation of the actual data. Although the hierarchical nature of these standards is convenient for software developers, statisticians frequently transform the data into a flat tabular structure (DataFrame) for analysis. This tabular representation is usually stored as a CSV file.
Methods: We developed a DataFrame-Generator (https://dataframegenerator.uni-muenster.de/) in which columns can be configured manually or imported automatically from the metadata formats mentioned above. This allows statisticians to develop and test their analyses even before the data is captured. The tool supports uniform and Gaussian distributions for numeric and temporal values, and uses the Java Faker [6] library to generate pseudo-realistic string values. The resulting file can be downloaded as a CSV or XLSX file. The software was developed using Java 8 and Spring in the backend and Bootstrap CSS and JavaScript in the frontend.
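The column-wise generation described above can be illustrated with a minimal sketch. It is not the tool's implementation (which is written in Java); the column configuration format, the column names, and the small list of string values standing in for Java Faker are all assumptions made for this example.

```python
import csv
import io
import random
from datetime import date, timedelta

# Hypothetical column configuration; names and parameters are illustrative only.
COLUMNS = [
    {"name": "age", "kind": "uniform_int", "low": 18, "high": 90},
    {"name": "weight_kg", "kind": "gaussian", "mean": 75.0, "sd": 12.0},
    {"name": "visit_date", "kind": "uniform_date",
     "start": date(2020, 1, 1), "end": date(2020, 12, 31)},
    # Stand-in for a faker library: pick from a fixed list of strings.
    {"name": "city", "kind": "choice", "values": ["Münster", "Berlin", "Hamburg"]},
]


def generate_value(column, rng):
    """Draw one random value according to the column's configured distribution."""
    kind = column["kind"]
    if kind == "uniform_int":
        return rng.randint(column["low"], column["high"])  # both bounds inclusive
    if kind == "gaussian":
        return round(rng.gauss(column["mean"], column["sd"]), 1)
    if kind == "uniform_date":
        span_days = (column["end"] - column["start"]).days
        return column["start"] + timedelta(days=rng.randint(0, span_days))
    if kind == "choice":
        return rng.choice(column["values"])
    raise ValueError(f"unknown column kind: {kind}")


def generate_rows(columns, n_rows, seed=0):
    """Generate n_rows dictionaries, one value per configured column."""
    rng = random.Random(seed)  # seeded so that test data is reproducible
    return [{c["name"]: generate_value(c, rng) for c in columns}
            for _ in range(n_rows)]


def to_csv(columns, rows):
    """Serialise the rows as CSV text with one header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[c["name"] for c in columns])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(COLUMNS, generate_rows(COLUMNS, n_rows=5, seed=1)))
```

Seeding the generator is a design choice for this sketch: it keeps the simulated data stable between runs, which is convenient when an analysis script is developed against it.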
For openEHR, metadata is represented using archetypes and templates. Because the original template format is very cumbersome, the Better Platform (the most widely used implementation) defines its own WebTemplate format [7].
Results: The DataFrame-Generator can import openEHR WebTemplates and generate a DataFrame structure in which the column names are derived from paths in the WebTemplate. As a result, the generated file can also be imported directly into the Better Platform.
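As an illustration of deriving column names from paths, the following sketch walks a heavily simplified WebTemplate-like tree and joins the node ids with "/". The example template is invented, and real WebTemplates carry more information (for instance attribute suffixes and indices for repeating nodes) that is deliberately left out here; the exact naming rules used by the DataFrame-Generator are not reproduced.

```python
def leaf_paths(node, prefix=""):
    """Return the '/'-joined id path of every leaf below (and including) node."""
    path = f"{prefix}/{node['id']}" if prefix else node["id"]
    children = node.get("children", [])
    if not children:
        return [path]  # a leaf becomes one column
    paths = []
    for child in children:
        paths.extend(leaf_paths(child, path))
    return paths


# Invented, simplified example: only "id", "rmType" and "children" are modelled.
web_template = {
    "tree": {
        "id": "vital_signs",
        "children": [
            {"id": "body_temperature", "children": [
                {"id": "temperature", "rmType": "DV_QUANTITY"},
                {"id": "time", "rmType": "DV_DATE_TIME"},
            ]},
            {"id": "pulse", "children": [
                {"id": "rate", "rmType": "DV_QUANTITY"},
            ]},
        ],
    }
}

columns = leaf_paths(web_template["tree"])
# columns == ["vital_signs/body_temperature/temperature",
#             "vital_signs/body_temperature/time",
#             "vital_signs/pulse/rate"]
```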
For ODM, there are many different tabular representations, each with its own advantages and disadvantages. For ease of use, we chose a DataFrame representation in which one row maps to one <SubjectData/> element in the generated ODM file.
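The row-to-subject mapping can be sketched as follows. The example assumes that all columns of a row belong to a single study event, form, and item group, and that column names double as ItemOIDs; the OIDs and the `subject_id` key column are hypothetical, and the ODM namespace and metadata section are omitted for brevity.

```python
import xml.etree.ElementTree as ET


def rows_to_odm(rows, study_oid, event_oid, form_oid, group_oid,
                key_column="subject_id"):
    """Build an ODM ClinicalData tree with one SubjectData element per row."""
    odm = ET.Element("ODM")
    clinical = ET.SubElement(odm, "ClinicalData",
                             StudyOID=study_oid, MetaDataVersionOID="MDV.1")
    for row in rows:
        subject = ET.SubElement(clinical, "SubjectData",
                                SubjectKey=str(row[key_column]))
        event = ET.SubElement(subject, "StudyEventData", StudyEventOID=event_oid)
        form = ET.SubElement(event, "FormData", FormOID=form_oid)
        group = ET.SubElement(form, "ItemGroupData", ItemGroupOID=group_oid)
        for column, value in row.items():
            if column == key_column:
                continue  # the key is already stored as SubjectKey
            ET.SubElement(group, "ItemData", ItemOID=column, Value=str(value))
    return odm


# Hypothetical rows and OIDs, for illustration only.
example_rows = [
    {"subject_id": "S001", "I.AGE": 42, "I.WEIGHT": 80.5},
    {"subject_id": "S002", "I.AGE": 57, "I.WEIGHT": 69.0},
]
odm_root = rows_to_odm(example_rows, "ST.1", "SE.1", "F.1", "IG.1")

if __name__ == "__main__":
    print(ET.tostring(odm_root, encoding="unicode"))
```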
In FHIR, several resources are already predefined and can only be further constrained by using profiles. The FHIR Questionnaire resource, in contrast, provides a very flexible way to define metadata items, and the QuestionnaireResponse resource can then be used to store the instance data. The DataFrame-Generator can import a questionnaire and generate columns, using a custom path syntax to identify the corresponding items in the questionnaire.
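A minimal sketch of turning questionnaire items into columns is shown below. The custom path syntax of the DataFrame-Generator is not specified here, so this example simply assumes that nested `linkId` values are joined with "/"; the example questionnaire is invented.

```python
def questionnaire_columns(items, prefix=""):
    """Return (path, type) pairs for every answerable item, depth-first."""
    columns = []
    for item in items:
        path = f"{prefix}/{item['linkId']}" if prefix else item["linkId"]
        # Groups and display items carry no answers, so they yield no column.
        if item.get("type") not in ("group", "display"):
            columns.append((path, item.get("type")))
        # Items of any type may contain nested items.
        columns.extend(questionnaire_columns(item.get("item", []), path))
    return columns


# Invented example questionnaire, reduced to the fields used above.
questionnaire = {
    "resourceType": "Questionnaire",
    "status": "draft",
    "item": [
        {"linkId": "demographics", "type": "group", "item": [
            {"linkId": "birthdate", "type": "date"},
            {"linkId": "sex", "type": "choice"},
        ]},
        {"linkId": "weight", "type": "decimal"},
    ],
}

fhir_columns = questionnaire_columns(questionnaire["item"])
# fhir_columns == [("demographics/birthdate", "date"),
#                  ("demographics/sex", "choice"),
#                  ("weight", "decimal")]
```

Keeping the item type next to each path lets a generator choose a suitable distribution per column, for example dates for `date` items and numbers for `decimal` items.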
Discussion: We have shown that it is possible to generate DataFrames from the metadata of different interoperability standards. Future work will address mapping these DataFrames to the instance representation of each standard and retrieving the DataFrame representation from the instance data for statistical evaluation. Other topics we are currently looking into are the handling of elements that may occur more than once and the generation of realistic correlations between data elements.
Acknowledgement: Supported by BMBF grant No. 01ZZ1802V (HiGHmed/Münster).
The authors declare that they have no competing interests.
The authors declare that an ethics committee vote is not required.
References
- 1. Semler SC, Wissing F, Heyder R. German Medical Informatics Initiative. Methods Inf Med. 2018;57(S 01):e50-e56. DOI: 10.3414/ME18-03-0003
- 2. Haarbrandt B, et al. HiGHmed – An Open Platform Approach to Enhance Care and Research across Institutional Boundaries. Methods Inf Med. 2018 Jul;57(S 01):e66-e81. DOI: 10.3414/ME18-02-0002
- 3. Atalag K, Beale T, Chen R, Gornik T, Heard S, McNicoll I. A semantically enabled, vendor-independent health computing platform.
- 4. Benson T, Grieve G. Principles of Health Interoperability. Springer International Publishing; 2016.
- 5. Minjoe S. Introduction to the CDISC Standards. In: PharmaSUG 2013 Conference Proceedings.
- 6. Java Faker. [Accessed July 2020]. Available from: https://github.com/DiUS/java-faker
- 7. Simplified Data Template. [Accessed July 2020]. Available from: https://specifications.openehr.org/releases/ITS-REST/latest/simplified_data_template.html