gms | German Medical Science

68. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

17.09. - 21.09.23, Heilbronn

Adapting study IT to the changing tide of times in a population-based cohort study on its path to FAIRness – an application example

Meeting Abstract

  • Carsten Oliver Schmidt - Universität Greifswald, Greifswald, Germany
  • Stephan Struckmann - Universitätsmedizin Greifswald, Greifswald, Germany
  • Dörte Radke - Department of Study of Health in Pomerania, University Medicine Greifswald, Greifswald, Germany
  • Adrian Richter - Institut für Community Medicine, Universitätsmedizin Greifswald, Greifswald, Germany
  • Elisa Kasbohm - Universitätsmedizin Greifswald, Greifswald, Germany
  • Susanne Westphal - Department of Study of Health in Pomerania, University Medicine Greifswald, Greifswald, Germany
  • Torsten Leddig - Universitätsmedizin Greifswald, Institut für Community Medicine, Greifswald, Germany
  • Jörg Henke - Universitätsmedizin Greifswald, Greifswald, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 68. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS). Heilbronn, 17.-21.09.2023. Düsseldorf: German Medical Science GMS Publishing House; 2023. DocAbstr. 327

doi: 10.3205/23gmds022, urn:nbn:de:0183-23gmds0221

Published: 15 September 2023

© 2023 Schmidt et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information, see http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: The successful conduct of epidemiologic cohort studies faces contradictory demands. Such studies may run over decades and require a high degree of infrastructural and methodological stability. Efficient study conduct requires elaborate IT environments. However, the technologies used must inevitably be adapted to technological advances, and previous decisions cannot simply be jettisoned. Old and new infrastructural developments can therefore come into conflict. Recent years have also seen increasing pressure to make studies findable, accessible, interoperable, and reusable (FAIR) [1]. This paper therefore describes how the related challenges to efficient study conduct were addressed in the Pomeranian Health Study (Study of Health in Pomerania, SHIP), a prototypical example of a major epidemiological cohort study [2].

State of the art: The implemented tools need to be reflected on from the perspective of current syntactic and semantic standards (e.g., CDISC-ODM, FHIR, SNOMED-CT) as well as from that of the FAIR guiding principles. An important criterion of success is the degree to which the chosen IT implementations have enabled networked research.

Concept: Developments took place with a strong focus on the specific needs of SHIP data collections when integrating a wide range of data sources (e.g., CRFs, imaging devices, sleep laboratory, biomarkers, medical devices). Technologies have evolved over time, e.g., from JavaServer Faces to the Spring Boot architecture.

Implementation: Over the past 15 years, a number of publicly available web tools have been developed to conduct SHIP (e.g., SHIPPIE/SHIPDesigner for data capture and form generation; WebMODYS for participant management; dataquieR, dqrep, and Square² for data quality assessments; FAIRequest for data applications; and a range of other tools, e.g., to handle query requests). Central data management was implemented early on, using Access, Oracle, and PostgreSQL. Currently, the PostgreSQL database management system is the main backend for achieving a fully integrated data environment. Data cleaning and initial checks are performed by a mostly automated, modular SAS analysis pipeline. Data can be exported to selected common data models (CDMs) such as CDISC-ODM and to other formats, e.g., OPAL, in order to display study- and item-level information in different networks and projects (Maelstrom, euCanSHare, NFDI4Health, Portal for Medical Data Models) and to enable federated data analyses. Cooperation with these networks has enabled the semantic tagging of tens of thousands of SHIP variables [3]. Metadata models have been improved continuously, e.g., to handle information relevant to the assessment of data quality.
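
To illustrate what an item-level metadata export to a CDM such as CDISC-ODM can look like in principle, the following is a minimal Python sketch. It is not the SHIP export pipeline. The function name, the OIDs, and the two example variables are hypothetical and chosen for illustration only. The output follows the general ODM 1.3 element structure (Study, MetaDataVersion, ItemDef) but is simplified and has not been validated against the ODM schema.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical item-level metadata; not actual SHIP variables.
ITEMS = [
    {"oid": "I.SBP", "name": "sbp",
     "label": "Systolic blood pressure (mmHg)", "datatype": "float"},
    {"oid": "I.SMOKING", "name": "smoking",
     "label": "Current smoking status", "datatype": "integer"},
]


def build_odm_metadata(study_oid, study_name, items):
    """Return a simplified ODM-style XML string describing the given items."""
    # Serialize the ODM namespace as the default namespace (no prefix).
    ET.register_namespace("", ODM_NS)

    def q(tag):
        # Qualify a tag name with the ODM namespace.
        return f"{{{ODM_NS}}}{tag}"

    root = ET.Element(q("ODM"), {
        "FileOID": f"{study_oid}.metadata",
        "FileType": "Snapshot",
        "ODMVersion": "1.3.2",
        "CreationDateTime": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    })
    study = ET.SubElement(root, q("Study"), {"OID": study_oid})

    # Study-level information.
    global_vars = ET.SubElement(study, q("GlobalVariables"))
    ET.SubElement(global_vars, q("StudyName")).text = study_name
    ET.SubElement(global_vars, q("StudyDescription")).text = "Illustrative example"
    ET.SubElement(global_vars, q("ProtocolName")).text = study_name

    # Item-level information: one ItemDef per variable.
    mdv = ET.SubElement(study, q("MetaDataVersion"),
                        {"OID": "MDV.1", "Name": "Version 1"})
    for item in items:
        item_def = ET.SubElement(mdv, q("ItemDef"), {
            "OID": item["oid"],
            "Name": item["name"],
            "DataType": item["datatype"],
        })
        question = ET.SubElement(item_def, q("Question"))
        text = ET.SubElement(question, q("TranslatedText"), {XML_LANG: "en"})
        text.text = item["label"]

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_odm_metadata("S.DEMO", "Demo cohort", ITEMS))
```

In a real setting, the item list would be generated from the study's central metadata store rather than written by hand, so that the same metadata can feed several export formats.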

Lessons learned: Developing proprietary software is time-consuming, and there should be a high threshold for deciding in favor of it and against using already available software. Yet, in a long-running study, specific requirements make any switch to standard solutions difficult. Despite the use of different tools, a high level of integration is possible. Detailed semantic coding, e.g., using SNOMED-CT, is currently beyond scope. SHIP tools have proven useful beyond local use, e.g., to implement a sister study in Poland. Participation in networked research enabled developments that would otherwise have been impossible. As a measure of success, SHIP has achieved an exceptionally high degree of visibility, contributing to more than 1500 peer-reviewed papers.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016;3:160018.
2. Völzke H, Schössow J, Schmidt CO, Jürgens C, Richter A, Werner A, et al. Cohort Profile Update: The Study of Health in Pomerania (SHIP). Int J Epidemiol. 2022;51(6):e372-e83.
3. Bergeron J, Doiron D, Marcon Y, Ferretti V, Fortier I. Fostering population-based cohort data discovery: The Maelstrom Research cataloguing toolkit. PLoS ONE. 2018;13(7):e0200926.