gms | German Medical Science

62. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

17.09. - 21.09.2017, Oldenburg

Generation of Semantically Annotated Data Landscapes of Four German University Hospitals

Meeting Abstract

  • Julian Varghese - Westfälische Wilhelms-Universität Münster, Münster, Germany
  • Philipp Bruland - Westfälische Wilhelms-Universität Münster, Münster, Germany
  • Sven Zenker - Universitätsklinikum Bonn, Bonn, Germany
  • Giulio Napolitano - Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
  • Matthias Schmid - Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
  • Claudia Ose - Universitätsklinikum Essen AöR, Essen, Germany
  • Markus Deckert - Universitätsklinikum Essen AöR, Essen, Germany
  • Karl-Heinz Jöckel - Universitätsklinikum Essen AöR, Essen, Germany
  • Britta Böckmann - Universitätsklinikum Essen AöR, Essen, Germany
  • Michael Müller - Universität zu Köln, Cologne, Germany
  • Andreas Beyer - Universität zu Köln, Cologne, Germany
  • Martin Dugas - Westfälische Wilhelms-Universität Münster, Münster, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 62. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Oldenburg, 17.-21.09.2017. Düsseldorf: German Medical Science GMS Publishing House; 2017. DocAbstr. 203

doi: 10.3205/17gmds175, urn:nbn:de:0183-17gmds1751

Published: 29 August 2017

© 2017 Varghese et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information, see http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: The BMBF funding scheme “Medical Informatics” intends to establish data integration centers at German University Hospitals. A systematic approach is needed to generate a high-resolution map of the local data landscape at each site. This map should contain a detailed catalog of the patient data elements held in local information systems, for both clinical and research purposes.

We present the methodology and initial results of generating such a high-resolution data landscape. It lists the available data elements and their semantic coding from the heterogeneous hospital information systems of one consortium comprising four German University Hospitals (Bonn, Cologne, Essen, and Münster).

Methods: Electronic exports of hospital information system forms (including Agfa ORBIS, Cerner Medico and ICM Dräger) and case report forms of clinical studies from the aforementioned four sites were collected. Structured and non-structured form elements (individual items and itemgroups) were syntactically normalized according to CDISC ODM to define the structure of items, their data types, measurement units and semantic codes.
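The normalization step described above can be pictured with a minimal sketch that builds one ODM-style item definition carrying a data type, a measurement unit reference, and a semantic code. This is an illustration under assumptions, not the authors' pipeline: the helper `build_item_def`, the OIDs, the alias context label `"UMLS CUI"`, and the example code value are hypothetical choices made for this example.

```python
import xml.etree.ElementTree as ET


def build_item_def(oid, name, data_type, unit_oid=None, umls_cuis=()):
    """Return an ODM-style ItemDef element for a single form item.

    The element and attribute names follow the general shape of CDISC ODM
    (ItemDef, MeasurementUnitRef, Alias); the Alias context label used for
    UMLS codes is an assumption of this sketch.
    """
    item = ET.Element("ItemDef", {"OID": oid, "Name": name, "DataType": data_type})
    if unit_oid is not None:
        # Reference to a measurement unit defined elsewhere in the metadata.
        ET.SubElement(item, "MeasurementUnitRef", {"MeasurementUnitOID": unit_oid})
    for cui in umls_cuis:
        # One Alias per semantic code attached to the item.
        ET.SubElement(item, "Alias", {"Context": "UMLS CUI", "Name": cui})
    return item


# Hypothetical item: OIDs and the code value are illustrative only.
item = build_item_def(
    oid="I.HEART_RATE",
    name="Heart rate",
    data_type="integer",
    unit_oid="MU.PER_MIN",
    umls_cuis=["C0018810"],
)
print(ET.tostring(item, encoding="unicode"))
```

Keeping the unit as a reference rather than free text means the unit itself (for example a UCUM expression) is defined once and shared by all items that use it.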

UCUM measurement units were used whenever applicable. Semantic coding was performed at the level of items and itemgroups using Unified Medical Language System (UMLS) codes according to established coding principles [1]. Codes were assigned by a team led by two physicians and comprising several medical students, all experienced in using established methods to maintain and improve UMLS coding uniformity [2].

After semantic annotation, UMLS-based semi-automatic semantic analyses were applied, following a methodology similar to one previously used to identify common data elements in myeloid leukemia [3] and in breast and prostate cancer [4].
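As a rough illustration of what a UMLS-based comparison across sites can look like, the sketch below counts in how many sites each concept code occurs and keeps those that reach a chosen threshold. It is a simplified stand-in, not the published methodology of [3] or [4]; the function names, site labels, and the placeholder codes are all hypothetical.

```python
from collections import Counter


def concept_site_counts(site_to_cuis):
    """Count, for each concept code, the number of sites in which it occurs."""
    counts = Counter()
    for cuis in site_to_cuis.values():
        # Use a set so repeated use within one site counts only once.
        counts.update(set(cuis))
    return counts


def common_concepts(site_to_cuis, min_sites):
    """Return the concept codes that occur in at least `min_sites` sites."""
    counts = concept_site_counts(site_to_cuis)
    return {cui for cui, n in counts.items() if n >= min_sites}


# Placeholder codes, not real UMLS concepts.
sites = {
    "site_a": ["C0000001", "C0000002", "C0000002"],
    "site_b": ["C0000002", "C0000003"],
    "site_c": ["C0000002", "C0000001"],
}
shared_by_all = common_concepts(sites, min_sites=3)
shared_by_two = common_concepts(sites, min_sites=2)
print(sorted(shared_by_all), sorted(shared_by_two))
```

Exact code matching of this kind only finds concepts that were coded identically at each site, which is why a later expert review can still add matches.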

Results: More than 35,000 source data elements were collected and annotated with UMLS codes that provide cross-references to SNOMED CT and other terminologies.

The annotated data elements are available on the Medical-Data-Models portal [5] with open access for academic use and can be downloaded in various formats, such as CSV, REDCap, or FHIR questionnaires.

At two of the sites (Bonn and Münster), the full item catalogues of the intensive care systems were semantically coded. Even without further code review by domain experts, UMLS-based analysis yielded an overlap of 532 identical medical concepts (out of 3183 concepts in Bonn and 3197 in Münster).
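To put the reported overlap in proportion, the following sketch derives simple ratios from the three figures given above. It assumes that the reported totals are counts of distinct concepts per site, so that the union can be computed by inclusion-exclusion; the function `overlap_stats` and the choice of the Jaccard index are additions made for illustration, not measures reported by the authors.

```python
def overlap_stats(shared, size_a, size_b):
    """Return the shared fraction per site and the Jaccard index.

    Assumes `size_a` and `size_b` count distinct concepts, so the union is
    size_a + size_b - shared.
    """
    union = size_a + size_b - shared
    return {
        "share_of_a": shared / size_a,
        "share_of_b": shared / size_b,
        "jaccard": shared / union,
    }


# Figures from the text: 532 shared concepts, 3183 in Bonn, 3197 in Münster.
stats = overlap_stats(shared=532, size_a=3183, size_b=3197)
for key, value in stats.items():
    print(f"{key}: {value:.1%}")
```

Under that assumption, roughly one in six concepts at each site has an identically coded counterpart at the other site.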

Discussion: UMLS-based identification of common medical concepts is an essential preparatory step for describing patient data landscapes and deriving corresponding common data elements within the consortium. Future review by domain experts is necessary and could enlarge the set of common concepts by identifying matches that the initial UMLS coding did not detect.

Continuous development and alignment with national and international standards are necessary to identify matching data elements and to facilitate data exchange within the consortium and beyond.



The authors declare that they have no competing interests.

The authors declare that an ethics committee vote was not required.


References

1. Varghese J, Dugas M. Frequency analysis of medical concepts in clinical trials and their coverage in MeSH and SNOMED-CT. Methods of Information in Medicine. 2015;54(1):83–92. DOI: 10.3414/ME14-01-0046
2. Dugas M, Meidt A, Neuhaus P, Storck M, Varghese J. ODMedit: uniform semantic annotation for data integration in medicine based on a public metadata repository. BMC Medical Research Methodology. 2016;16:65. DOI: 10.1186/s12874-016-0164-9
3. Varghese J, Holz C, Neuhaus P, Bernardi M, Boehm A, Ganser A, et al. Key Data Elements in Myeloid Leukemia. Studies in Health Technology and Informatics. 2016;228:282–286.
4. Krumm R, Semjonow A, Tio J, Duhme H, Bürkle T, Haier J, et al. The need for harmonized structured documentation and chances of secondary use - results of a systematic analysis with automated form comparison for prostate and breast cancer. Journal of Biomedical Informatics. 2014;51:86–99. DOI: 10.1016/j.jbi.2014.04.008
5. Example form for Neurologic Status Assessment at the University Hospital Münster. https://medical-data-models.org/forms/18101 Accessed in April 2017.