
63. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

2-6 September 2018, Osnabrück

Towards a process model for representing clinical datasets for Asthma/COPD endotypes in the OMOP CDM

Meeting Abstract

  • Bor Ditewig - University of Freiburg, Faculty of Medicine, Institute for Medical Biometry and Statistics, Freiburg, Germany; University of Amsterdam, Academic Medical Center, Department of Medical Informatics, Amsterdam, Netherlands
  • Christian Maier - Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  • Christian Haverkamp - Medical Center, University of Freiburg, Freiburg, Germany
  • Ronald Cornet - University of Amsterdam, Academic Medical Center, Department of Medical Informatics, Amsterdam, Netherlands
  • Harald Binder - University of Freiburg, Faculty of Medicine, Institute for Medical Biometry and Statistics, Freiburg, Germany
  • Hans-Ulrich Prokosch - Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  • Martin Boeker - University of Freiburg, Faculty of Medicine, Institute for Medical Biometry and Statistics, Freiburg, Germany; Medical Center, University of Freiburg, Freiburg, Germany
  • Petar Horki - University of Freiburg, Faculty of Medicine, Institute for Medical Biometry and Statistics, Freiburg, Germany; Medical Center, University of Freiburg, Freiburg, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 63. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Osnabrück, 02.-06.09.2018. Düsseldorf: German Medical Science GMS Publishing House; 2018. DocAbstr. 238

doi: 10.3205/18gmds134, urn:nbn:de:0183-18gmds1347

Published: 27 August 2018

© 2018 Ditewig et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information, see http://creativecommons.org/licenses/by/4.0/.



Text

Background: Asthma [1], [2] and chronic obstructive pulmonary disease (COPD) [3] are heterogeneous diseases that comprise endotypes, i.e. subtypes defined by molecular mechanisms or treatment responses [1]. A classifier could automatically indicate cluster membership in these endotypes. Such a classifier will be developed in the German MIRACUM consortium, which comprises multiple university hospitals and is funded by the BMBF as part of the Medical Informatics Initiative. The model will be trained with routine data from within the consortium. Currently, the necessary data are largely stored in silos such as hospital information systems (HIS). Not all sites use the same HIS, which leads to a lack of uniformity. Data that are not represented in a common data model with shared semantics make the exchange of information between healthcare organizations challenging and the use of data for research purposes hard. Harmonizing the asthma and COPD data sets makes the data available for distributed analysis across sites.

Aim of the Study: The objective of this presentation is to illustrate a process model that was applied to develop a harmonized data model in which the asthma and COPD data of all hospitals can be analysed. The OMOP CDM was chosen as a suitable representation. The model should accommodate the common data needed to differentiate between asthma and COPD endotypes. To establish semantic homogeneity, all data are coded with a standardized terminology.

Proposed Methods: This work covers the clinical considerations for including and excluding data items and the development of the semantic model. To determine which data to include, a qualitative literature study is conducted. The proposed dataset, mainly lung function and laboratory results, is being discussed in expert meetings with physicians to verify its quality. Once the items are chosen, they are mapped from local code systems to Logical Observation Identifiers Names and Codes (LOINC), which provides the most complete and detailed coverage of the chosen items. Talend Open Studio provides the necessary tools to extract, transform and load the data from different source systems and formats (e.g. HL7, CSV) into a staging area. Here, the data warehousing tool i2b2 functions as the staging area in which the data are consolidated. As a last step, the data will be transferred from the staging area to the structured OMOP CDM.
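
The mapping and transfer steps described above can be illustrated with a minimal Python sketch. It is not the MIRACUM implementation, which relies on Talend Open Studio and i2b2. All site names, local codes, target codes and concept IDs below are hypothetical placeholders rather than real LOINC codes or OMOP concept IDs. The output row uses a subset of the column names of the OMOP CDM MEASUREMENT table, and the second dictionary stands in for a lookup against the OMOP CONCEPT table.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical (site, local code) -> LOINC mapping. The targets are
# placeholders, NOT real LOINC codes; a real mapping would be curated
# with clinical experts at each site.
LOCAL_TO_LOINC = {
    ("SITE_A", "LUFU_FEV1"): "LOINC-PLACEHOLDER-FEV1",
    ("SITE_A", "LAB_EOS"): "LOINC-PLACEHOLDER-EOS",
    ("SITE_B", "fev1_l"): "LOINC-PLACEHOLDER-FEV1",
}

# Hypothetical stand-in for a lookup in the OMOP CONCEPT table
# (concept_code -> concept_id for vocabulary_id 'LOINC'). IDs are made up.
LOINC_TO_CONCEPT_ID = {
    "LOINC-PLACEHOLDER-FEV1": 900001,
    "LOINC-PLACEHOLDER-EOS": 900002,
}


@dataclass(frozen=True)
class StagedObservation:
    """One consolidated record as it might sit in the staging area."""
    site: str
    patient_id: int
    local_code: str
    value: float
    observed_on: date


@dataclass(frozen=True)
class MeasurementRow:
    """Subset of the columns of the OMOP CDM MEASUREMENT table."""
    person_id: int
    measurement_concept_id: int
    measurement_date: date
    value_as_number: float
    measurement_source_value: str


def to_measurement(obs: StagedObservation) -> Optional[MeasurementRow]:
    """Map one staged record to a MEASUREMENT row, or None if unmapped."""
    loinc = LOCAL_TO_LOINC.get((obs.site, obs.local_code))
    if loinc is None:
        return None
    concept_id = LOINC_TO_CONCEPT_ID.get(loinc)
    if concept_id is None:
        return None
    return MeasurementRow(
        person_id=obs.patient_id,
        measurement_concept_id=concept_id,
        measurement_date=obs.observed_on,
        value_as_number=obs.value,
        measurement_source_value=obs.local_code,  # keep the local code for traceability
    )


def transfer(staged):
    """Split staged records into mapped rows and unmapped leftovers."""
    rows, unmapped = [], []
    for obs in staged:
        row = to_measurement(obs)
        if row is None:
            unmapped.append(obs)
        else:
            rows.append(row)
    return rows, unmapped
```

In this sketch, records from two sites with different local codes for the same item end up with the same concept ID, while unmapped records are collected separately so that they can be reviewed rather than silently dropped.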

Points for Discussion: As endotypes have not been defined in an unambiguous, generally accepted way, no predetermined minimal data set is available. However, experts from different hospitals in the consortium are consulted, which minimizes the risk of missing items and of misinterpreting data. In this work, data originating from different German university hospitals will be used to train the classifier. By employing the OMOP CDM with LOINC as the terminology, we demonstrated the feasibility of representing a complex clinical entity for classification purposes.

Acknowledgements: MIRACUM is funded by the German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF) within the Medical Informatics Initiative (Medizininformatik-Initiative).

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Anderson GP. Endotyping asthma: new insights into key pathogenic mechanisms in a complex, heterogeneous disease. Lancet. 2008 September 20;372:1107-19.
2. Lötvall J, Akdis CA, Bacharier LB, Bjermer L, Casale TB, Custovic A, et al. Asthma endotypes: a new approach to classification of disease entities within the asthma syndrome. Journal of Allergy and Clinical Immunology. 2011 November 12;127(2):355-360.
3. Fijačko V, Labor M, Fijačko M, Škrinjarić-Cincar S, Labor S, Dubravčić ID, et al. Predictors of short-term LAMA ineffectiveness in treatment naïve patients with moderate to severe COPD. Wiener klinische Wochenschrift. 2018 January;10:1-12.