62. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

17.09. - 21.09.2017, Oldenburg

Data Aggregation Patterns Between a Hospital Information System and the Database of a Large Clinical Study

Meeting Abstract

  • Mathias Kaspar - Comprehensive Heart Failure Center, University Hospital Würzburg, Würzburg, Germany
  • Georg Fette - Comprehensive Heart Failure Center, University Hospital Würzburg, Würzburg, Germany; Chair of Computer Science VI, Würzburg University, Würzburg, Germany
  • Maximilian Ertl - Service Center Medical Informatics, University Hospital Würzburg, Würzburg, Germany
  • Georg Dietrich - Chair of Computer Science VI, Würzburg University, Würzburg, Germany
  • Jonathan Krebs - Chair of Computer Science VI, Würzburg University, Würzburg, Germany
  • Stefan Störk - Comprehensive Heart Failure Center, University Hospital Würzburg, Würzburg, Germany
  • Frank Puppe - Chair of Computer Science VI, Würzburg University, Würzburg, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 62. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Oldenburg, 17.-21.09.2017. Düsseldorf: German Medical Science GMS Publishing House; 2017. DocAbstr. 089

doi: 10.3205/17gmds034, urn:nbn:de:0183-17gmds0343

Published: 29 August 2017

© 2017 Kaspar et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information see http://creativecommons.org/licenses/by/4.0/.


Introduction: Clinical studies require the documentation of large data sets, which are collected in electronic data capture (EDC) systems, often only after manual documentation. However, large amounts of study data are frequently already available in the hospital information system (HIS) as data documented during routine care. In principle, it would thus “only” require transforming and transferring these data into the EDC system [1]. But which transformation steps are necessary to use typical HIS data (single and tabular data elements) in all its detail to fill EDC system forms? This work categorizes the transformation experiences from the Acute Heart Failure Registry (AHFR).

Methods: A clinical data warehouse (DWH) has been developed and deployed at the University Hospital of Würzburg, providing pseudonymized access to the data of our HIS. Because the AHFR requires not only simple transformations but also very specific aggregations, a comprehensive R interface was developed and has been described previously [2].

Results: The setup was used to import AHFR study patients (n=800) into the EDC system. During the development of the transfer pipeline, we had to solve several transformation problems, which we grouped into three basic transformation patterns (A to C).

A. Single data elements in and out. The simplest example is the direct adoption of single data elements from a base system without the need for any further transformation of the data itself. Such data elements typically have no direct relation to other data elements except that they belong to the same patient and/or patient case.
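Pattern A can be illustrated with a minimal sketch. It is written in Python for illustration only; the pipeline described in this abstract uses an R interface, and all record layouts, element names, and EDC field names below are hypothetical.

```python
# Hypothetical HIS records: one independent value per patient case and element.
his_records = [
    {"case_id": "C1", "element": "height_cm", "value": 178},
    {"case_id": "C1", "element": "sex", "value": "m"},
    {"case_id": "C2", "element": "height_cm", "value": 165},
]

# Hypothetical mapping from HIS element names to EDC form fields.
FIELD_MAP = {"height_cm": "baseline.height", "sex": "baseline.sex"}


def transfer_single_elements(records, field_map):
    """Copy mapped values unchanged into one EDC form dict per patient case."""
    forms = {}
    for rec in records:
        target = field_map.get(rec["element"])
        if target is None:
            continue  # element is not part of the EDC form
        forms.setdefault(rec["case_id"], {})[target] = rec["value"]
    return forms


edc_forms = transfer_single_elements(his_records, FIELD_MAP)
```

The only link between the values is the shared case identifier, so no value depends on any other.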

B. Multiple related data elements in and single data elements out. Complexity increases if the input data elements are linked to each other. This may be the case if a data element has multiple attributes and if this element (with its attributes) occurs multiple times during a hospitalization or study event (e.g. a single day).
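A minimal sketch of pattern B, in Python for illustration only: a measurement with several attributes (time, value, unit) occurs repeatedly, and a single output value is derived from it. The selection rule used here (earliest measurement of a given day) and all names and values are assumptions, not the rules used for the AHFR.

```python
from datetime import datetime

# Hypothetical repeated measurements; each row links a value to its attributes.
measurements = [
    {"case_id": "C1", "taken_at": datetime(2017, 1, 2, 14, 0), "value": 1.4, "unit": "mg/dl"},
    {"case_id": "C1", "taken_at": datetime(2017, 1, 2, 6, 30), "value": 1.1, "unit": "mg/dl"},
    {"case_id": "C1", "taken_at": datetime(2017, 1, 3, 7, 0), "value": 0.9, "unit": "mg/dl"},
]


def first_value_on_day(rows, case_id, day, unit):
    """Return the earliest value of one case on one day, or None if absent."""
    candidates = [
        r for r in rows
        if r["case_id"] == case_id
        and r["taken_at"].date() == day
        and r["unit"] == unit
    ]
    if not candidates:
        return None
    # The time attribute decides which of the linked rows supplies the value.
    return min(candidates, key=lambda r: r["taken_at"])["value"]
```

The output is a single value, but it can only be chosen by evaluating the attributes that belong to each occurrence.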

C. Multiple related data elements in and out. The most complex transformations arose from the need to transform multiple linked input data elements into multiple linked output data elements, which may additionally involve custom calculations in between. Such data elements typically occur multiple times during a single hospital visit.
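A minimal sketch of pattern C, in Python for illustration only: several linked input rows are turned into several linked output rows with a custom calculation in between. The example data (medication administrations), the calculation (total dose per drug and day), and all field names are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical administrations; drug, day, and dose are linked within each row.
orders = [
    {"case_id": "C1", "drug": "ramipril", "day": "2017-01-02", "dose_mg": 2.5},
    {"case_id": "C1", "drug": "ramipril", "day": "2017-01-02", "dose_mg": 2.5},
    {"case_id": "C1", "drug": "ramipril", "day": "2017-01-03", "dose_mg": 5.0},
    {"case_id": "C1", "drug": "torasemide", "day": "2017-01-02", "dose_mg": 10.0},
]


def daily_dose_rows(rows):
    """Sum doses per case, drug, and day; return one linked output row each."""
    totals = defaultdict(float)
    for r in rows:
        totals[(r["case_id"], r["drug"], r["day"])] += r["dose_mg"]
    # Sorting the keys gives a stable order for a repeating EDC form.
    return [
        {"case_id": case_id, "drug": drug, "day": day, "total_dose_mg": total}
        for (case_id, drug, day), total in sorted(totals.items())
    ]


dose_rows = daily_dose_rows(orders)
```

Each output row keeps drug, day, and calculated dose together, so the linkage between the elements survives the transformation.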

Discussion: This work exemplifies patterns of transformations and aggregations required to transfer data between a HIS and an EDC system in the context of a large clinical study. A very basic requirement that a DWH must fulfill in order to handle more complex, related data and to prefill an EDC system as described above is the linkage between multiple data elements at the level of documents. i2b2, for example, supports a similar functionality with its “modifiers” concept [3]. A key question in any DWH setting is where to aggregate the data. Aggregation could take place between the HIS and the DWH, so that the person using the DWH already receives the final variables. This would also reduce the complexity of DWH query systems. However, even with the simplest data, different studies may impose conflicting requirements.



The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Prather JC, Lobach DF, Goodwin LK, Hales JW, Hage ML, Hammond WE. Medical data mining: knowledge discovery in a clinical data warehouse. Proceedings of the AMIA Annual Fall Symposium. 1997;101-105.
2. Kaspar M, Ertl M, Fette G, Dietrich G, Toepfer M, Angermann C, Störk S, Puppe F. Data Linkage from Clinical to Study Databases via an R Data Warehouse User Interface. Experiences from a Large Clinical Follow-up Study. Methods Inf Med. 2016;55:381-6.
3. London JW, Chatterjee D. Implications of Observation-Fact Modifiers to i2b2 Ontologies. In: 2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW). 2011. p. 29-30.