Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH)

8-13 September 2024, Dresden

Improving the LIFE Research Data Request Workflow with the TOP Phenotyping Framework

Meeting Abstract

  • Christoph Beger - Institute for Medical Informatics, Statistics and Epidemiology, Leipzig University, Leipzig, Germany
  • Melanie Eberl - LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
  • Yvonne Dietz - LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
  • Franz Matthies - Institute for Medical Informatics, Statistics and Epidemiology, Leipzig University, Leipzig, Germany
  • Ralph Schäfermeier - Institute for Medical Informatics, Statistics and Epidemiology, Leipzig University, Leipzig, Germany
  • Konrad Höffner - Institute for Medical Informatics, Statistics and Epidemiology, Leipzig University, Leipzig, Germany
  • Matthias Reusche - LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
  • Alexandr Uciteli - Institute for Medical Informatics, Statistics and Epidemiology, Leipzig University, Leipzig, Germany

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH). Dresden, 08.-13.09.2024. Düsseldorf: German Medical Science GMS Publishing House; 2024. DocAbstr. 980

doi: 10.3205/24gmds114, urn:nbn:de:0183-24gmds1143

Published: 6 September 2024

© 2024 Beger et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Sharing clinical research data is essential for advancing medical knowledge [1], [2]. Retrospective clinical data allows researchers to validate existing findings, explore new questions, and develop computable phenotype algorithms for efficient analysis of electronic medical records. However, accessing and utilising data often presents challenges.

The LIFE-Adult-Study, a population-based cohort investigating chronic diseases, currently relies on a custom tabular data request workflow with limited metadata. This unstandardised approach hinders interoperability and makes data exploration and analysis less efficient. This work explores the application of the Terminology- and Ontology-based Phenotyping (TOP) Framework [3] to streamline the LIFE data request workflow.

Methods: We combined the existing LIFE data request workflow and the TOP Framework by developing a new data import component for the framework. The component makes LIFE data (CSV) and metadata (PDF) available to the framework via an intermediate database, which is treated like any other data source of the framework.
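The import step described above can be pictured with a minimal Python sketch. It is not the authors' import component: SQLite as the intermediate database, the table name `ses_export`, the column names, and the helper `import_csv` are all assumptions made for illustration, and the extraction of metadata from PDF is left out entirely.

```python
import csv
import io
import sqlite3


def quote(identifier: str) -> str:
    """Quote an SQL identifier for SQLite, doubling embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def import_csv(conn: sqlite3.Connection, table: str, csv_file, id_column: str) -> int:
    """Load one CSV export into a table of the intermediate database.

    Returns the number of imported rows. Empty cells are stored as NULL.
    """
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []
    if id_column not in columns:
        raise ValueError(f"CSV must contain the identifier column {id_column!r}")

    column_sql = ", ".join(quote(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    # SQLite allows untyped columns, so the table simply mirrors the CSV header.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {quote(table)} ({column_sql})")

    rows = [tuple(row[c] or None for c in columns) for row in reader]
    conn.executemany(
        f"INSERT INTO {quote(table)} ({column_sql}) VALUES ({placeholders})", rows
    )
    conn.commit()
    return len(rows)


# Hypothetical export with made-up pseudonyms and column names.
EXAMPLE_CSV = "pseudonym,education_score,income_score\nP001,3.0,4.5\nP002,,2.0\n"

conn = sqlite3.connect(":memory:")
imported = import_csv(conn, "ses_export", io.StringIO(EXAMPLE_CSV), "pseudonym")
```

Once the rows are in a database table, a framework that already knows how to read from databases needs no special handling for the original CSV files, which is the point of the intermediate database in the workflow.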

The resulting data request workflow was evaluated by submitting a project agreement to LIFE data management that requested the data needed to calculate the socio-economic status (SES) index. The index was then calculated twice, once with the TOP Framework and once with the SPSS-based calculation currently in use at LIFE, and the results were compared.
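A comparison of this kind can be sketched as follows. The function `compare_indices`, the pseudonyms, and the index values are hypothetical; the sketch only shows one way to check two sets of per-participant values against each other and does not reproduce the SES calculation itself.

```python
import math


def compare_indices(reference: dict, candidate: dict, tolerance: float = 1e-9) -> dict:
    """Compare two index calculations keyed by participant identifier.

    Values are treated as equal if they differ by at most `tolerance`.
    """
    shared = reference.keys() & candidate.keys()
    mismatches = sorted(
        pid
        for pid in shared
        if not math.isclose(reference[pid], candidate[pid], rel_tol=0.0, abs_tol=tolerance)
    )
    return {
        "compared": len(shared),
        "mismatches": mismatches,
        "missing_in_candidate": sorted(reference.keys() - candidate.keys()),
        "missing_in_reference": sorted(candidate.keys() - reference.keys()),
    }


def is_perfect_match(report: dict) -> bool:
    """True if every participant appears in both sets with matching values."""
    return not (
        report["mismatches"]
        or report["missing_in_candidate"]
        or report["missing_in_reference"]
    )


# Made-up values standing in for the pre-calculated and the newly calculated index.
precalculated = {"P001": 12.5, "P002": 9.0, "P003": 15.0}
recalculated = {"P001": 12.5, "P002": 9.0, "P003": 15.0}

report = compare_indices(precalculated, recalculated)
```

Reporting missing participants separately from value mismatches helps to distinguish transfer problems (rows lost on the way) from calculation differences.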

Results: We successfully applied the TOP Framework to socio-demographic data of approximately 10,000 LIFE-Adult participants. The evaluation showed a perfect match between the SES values calculated by the TOP Framework and the pre-calculated LIFE values. This result demonstrates the pipeline’s ability to transfer and analyse the data without loss or distortion.

Discussion: The concept of utilising standardised terminologies and ontologies to facilitate data sharing and analysis in clinical research has gained significant traction in recent years [4], [5].

Integrating the TOP Framework into the LIFE-Adult-Study data request workflow offers significant improvements, but limitations must be considered. Direct access to the LIFE-Adult-Study data was prohibited; therefore, the TOP Framework’s query functionality could not be run against the study database itself. The current implementation does not map the extracted data elements to standardised terminologies, so other studies cannot readily interpret the meaning of the data elements without additional work. The project agreement workflow currently in place is specifically designed for the LIFE study data management system. Using the TOP Framework with other studies will require significant modifications to adapt the workflow to their specific data management systems.

These limitations highlight the need for further development to expand the applicability of the TOP Framework. Addressing these issues would improve the framework's generalisability and reusability beyond the LIFE-Adult-Study.
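As an illustration of the missing terminology mapping mentioned above, the following sketch annotates data element names with codes and reports the elements that remain unmapped. It does not describe an existing feature of the TOP Framework. The `Coding` class, the `annotate` helper, the terminology URI, and the codes are placeholders and do not correspond to real terminology entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coding:
    """A reference to a concept in a terminology."""

    system: str   # identifier of the terminology, here a placeholder URI
    code: str     # placeholder code, not a real terminology entry
    display: str  # human-readable label


# Hypothetical mapping table from data element names to codings.
MAPPING = {
    "education_score": Coding("https://example.org/terminology", "EX-0001", "Education score"),
}


def annotate(columns, mapping):
    """Split data element names into annotated and unmapped ones."""
    annotated = {c: mapping[c] for c in columns if c in mapping}
    unmapped = [c for c in columns if c not in mapping]
    return annotated, unmapped


annotated, unmapped = annotate(["education_score", "income_score"], MAPPING)
```

Listing the unmapped elements explicitly makes the remaining curation work visible before data are shared with other studies.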

Conclusion: The TOP Framework allows domain experts to develop phenotype models (algorithms and derivatives) and to query the data of individuals with defined phenotypes. We outline a pipeline for making data artefacts provided by a data management authority (in this case, the LIFE research study) available to TOP. This enables researchers to independently model their derivatives and queries and execute them on the provided LIFE data. This work demonstrates the value of TOP in bridging the gap between data access and utilisation in clinical research settings.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Committee on Strategies for Responsible Sharing of Clinical Trial Data; Board on Health Sciences Policy; Institute of Medicine. Guiding Principles for Sharing Clinical Trial Data. In: Sharing Clinical Trial Data: Maximizing Benefits, Minimizing Risk. Washington (DC): National Academies Press (US); 2015.
2. Modi ND, Kichenadasse G, Hoffmann TC, Haseloff M, Logan JM, Veroniki AA, et al. A 10-year update to the principles for clinical trial data sharing by pharmaceutical companies: perspectives based on a decade of literature and policies. BMC Med. 2023;21:400. DOI: 10.1186/s12916-023-03113-0
3. Beger C, Matthies F, Schäfermeier R, Uciteli A. Model-driven execution of phenotype algorithms – introduction of the Terminology- and Ontology-based Phenotyping Framework. GMS Med Inform Biom Epidemiol. 2023;19:Doc17. DOI: 10.3205/MIBE000256
4. Alzoubi H, Alzubi R, Ramzan N, West D, Al-Hadhrami T, Alazab M. A Review of Automatic Phenotyping Approaches using Electronic Health Records. Electronics. 2019;8:1235. DOI: 10.3390/electronics8111235
5. Pathak J, Bailey KR, Beebe CE, Bethard S, Carrell DS, Chen PJ, et al. Normalization and standardization of electronic health records for high-throughput phenotyping: the SHARPn consortium. J Am Med Inform Assoc. 2013;20:e341–8. DOI: 10.1136/amiajnl-2013-001939