gms | German Medical Science

GMDS 2014: 59. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

7-10 September 2014, Göttingen

Data for Systems Medicine

Meeting Abstract

  • M. Ganzinger - Universität Heidelberg, Heidelberg
  • H. Goldschmidt - Universität Heidelberg, Heidelberg
  • P. Knaup-Gregori - Universität Heidelberg, Heidelberg

GMDS 2014. 59. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Göttingen, 07.-10.09.2014. Düsseldorf: German Medical Science GMS Publishing House; 2014. DocAbstr. 188

doi: 10.3205/14gmds100, urn:nbn:de:0183-14gmds1003

Published: 4 September 2014

© 2014 Ganzinger et al.
This is an Open Access article distributed under the terms of the Creative Commons license (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.de). It may be reproduced, distributed and made publicly available, provided that the author and source are credited.


Text

Introduction: Systems medicine is a new approach to delivering personalized medicine [1], [2]. It relies heavily on information technology support during diagnosis, treatment planning and therapy. To this end, computer models of diseases are developed from pools of phenotypic and genotypic data.

The models are applied to the data of new patients to help physicians find the most appropriate treatment path. The patients’ personal treatment goals are taken into account in this process. For example, patient and physician might decide on a line of treatment that is unlikely to offer the best possible overall survival but is expected to have only mild side effects and thus optimizes quality of life.

Valid models require data collected on a cohort of patients with a specific disease. However, these data often originate from various sources and were collected for other purposes, so they have to be prepared for use in systems medicine.

In this manuscript we describe the data preparation approach developed for the research project “Clinically applicable, omics-based assessment of survival, side effects, and targets in multiple myeloma” (CLIOMMICS). The goal of the project is to develop a systems medicine IT platform that facilitates the use of “omics”, molecular, and conventional clinical data for personalized myeloma treatment, increasing efficacy while reducing side effects.

Multiple myeloma is a rarely curable malignant disease of the bone marrow. It is relatively rare (4-6 per 100,000 people per year) and mainly affects older people (median age: 65-70 years) [3]. We use multiple myeloma as the example disease for developing our systems medicine platform.

Methods and Materials: Our clinical partners have collected high-quality clinical data on patients diagnosed with multiple myeloma for more than a decade. Standard clinical data include fields such as diagnosis, treatment, progression and survival. These data are complemented by fluorescence in situ hybridization (FISH) data, which capture chromosomal aberrations. In addition, gene expression profiles (GEP), RNA sequencing (RNAseq) data, or single-nucleotide polymorphism (SNP) data sets were created for some of the patients. For a consistent view of the patient cases, these data need to be integrated.

Results: For CLIOMMICS we defined a cohort of patients diagnosed with multiple myeloma that is well described by clinical data. For about half of them, FISH data describing the chromosomal status of their malignant plasma cells are also available and linked to the clinical data. Likewise, gene expression profiles based on microarray technology have been integrated for another sub-cohort. While SNP data sets are already part of the data pool, RNAseq data are still being prepared. We expect data for several hundred patients of the CLIOMMICS cohort to be integrated by the end of 2014. To ensure compliance with privacy laws, a pseudonymization concept is being developed and established [4].
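To illustrate the general idea of pseudonymization, the following is a minimal sketch using keyed hashing (HMAC-SHA256). It is not the concept from [4] and not the CLIOMMICS implementation; the function name `pseudonymize`, the `PSN-` prefix, the key and the identifiers are all assumptions made for this example.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for patient_id using keyed hashing."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    # Truncated for readability; keep the full hex digest if collisions are a concern.
    return "PSN-" + digest.hexdigest()[:16]


# Hypothetical key: in practice it would be held by a trusted party, not in source code.
KEY = b"example-key-not-for-production"

p1 = pseudonymize("4711", KEY)
p2 = pseudonymize("4711", KEY)
p3 = pseudonymize("4712", KEY)
```

The same identifier always maps to the same pseudonym, so records from different sources can still be linked, while the identifier cannot be recomputed without the key.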

Each data source is checked for consistency by merging multiple entries per patient. Next, a strategy is developed to link patient records across data sources. Where possible, numeric identifiers such as the patient identifier are used. Depending on the data source, other identifiers or patients’ demographic data serve as a fallback. Since the treatment results of new cases will become part of the systems medicine base data set, the data integration process has to be automated. For this purpose, we set up transformation jobs with an open source process management tool.
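The consistency check and the linkage strategy with a demographic fallback can be sketched as follows. This is an illustration under stated assumptions, not the project's transformation jobs: the record layout, the field names (`patient_id`, `name`, `birth_date`, `del17p`) and the sample values are hypothetical.

```python
from typing import Optional

# Hypothetical records: clinical data carry patient_id, FISH data partly lack it.
clinical = [
    {"patient_id": 1, "name": "Doe", "birth_date": "1950-03-01", "diagnosis": "MM"},
    {"patient_id": 1, "name": "Doe", "birth_date": "1950-03-01", "treatment": "ASCT"},
    {"patient_id": 2, "name": "Roe", "birth_date": "1948-11-20", "diagnosis": "MM"},
]
fish = [
    {"patient_id": 1, "del17p": True},
    {"patient_id": None, "name": "Roe", "birth_date": "1948-11-20", "del17p": False},
    {"patient_id": None, "name": "Poe", "birth_date": "1960-01-01", "del17p": False},
]


def merge_entries(records):
    """Merge multiple entries per patient; raise on conflicting values."""
    merged = {}
    for rec in records:
        target = merged.setdefault(rec["patient_id"], {})
        for field, value in rec.items():
            if field in target and target[field] != value:
                raise ValueError(f"conflict in {field!r} for patient {rec['patient_id']}")
            target[field] = value
    return merged


def find_patient(record, patients) -> Optional[int]:
    """Link by numeric identifier if present, else fall back to demographics."""
    pid = record.get("patient_id")
    if pid is not None:
        return pid if pid in patients else None
    key = (record.get("name"), record.get("birth_date"))
    matches = [p for p, data in patients.items()
               if (data.get("name"), data.get("birth_date")) == key]
    # Accept the fallback only if it is unambiguous.
    return matches[0] if len(matches) == 1 else None


patients = merge_entries(clinical)
links = [find_patient(rec, patients) for rec in fish]
```

Here the first FISH record is linked by its identifier, the second by name and date of birth, and the third stays unlinked because no matching patient exists. Rejecting ambiguous fallback matches is a design choice of this sketch: it favours leaving a record unlinked over linking it to the wrong patient.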

The data pool is currently being examined, and several approaches to defining disease models are being investigated. As more patients are treated, more data become available for the data pool, either from new patients or as follow-up data for existing patients. Our data integration pipeline loads these data into the pool. They also affect the models: with more data, the models can be refined to better represent the properties of the disease.
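Loading new patients and follow-up data into an existing pool can be sketched as an incremental, repeatable load step. The in-memory `pool` dictionary, the function `load_batch` and the sample entries are assumptions for illustration; they do not describe the actual pipeline or the tool it is built with.

```python
def load_batch(pool, batch):
    """Add new patients and append follow-up entries for known ones.

    pool maps a pseudonym to a list of dated entries; batch is a list of
    (pseudonym, date, payload) tuples. Returns (new_patients, added_entries).
    """
    new_patients = 0
    added = 0
    for pseudonym, date, payload in batch:
        if pseudonym not in pool:
            pool[pseudonym] = []
            new_patients += 1
        entry = (date, payload)
        # Skip entries that were already loaded, so reruns do not duplicate data.
        if entry not in pool[pseudonym]:
            pool[pseudonym].append(entry)
            pool[pseudonym].sort(key=lambda e: e[0])  # ISO dates sort chronologically
            added += 1
    return new_patients, added


pool = {"PSN-a": [("2013-05-02", {"status": "diagnosis"})]}
batch = [
    ("PSN-a", "2014-06-10", {"status": "progression"}),  # follow-up for a known patient
    ("PSN-b", "2014-07-01", {"status": "diagnosis"}),    # new patient
]
first = load_batch(pool, batch)
second = load_batch(pool, batch)  # rerunning the same batch adds nothing
```

Making the load step safe to rerun matters for an automated pipeline, since a job may be repeated after a failure without duplicating entries in the pool.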

Discussion: Using multiple myeloma as a reference disease, we show how existing heterogeneous data collections can be prepared for systems medicine. A comprehensive view of the disease and of individual patients requires both phenotypic and genotypic data. The latter pose additional challenges: they comprise a large number of variables, which makes analysis complex. In addition, data volumes are high and computational analysis pipelines tend to have long run times.

While there are similarities to the extract-transform-load (ETL) processes known from data integration for data warehouses, the requirements for data in systems medicine are different [5]. In data warehouses, the aim is typically to identify one or more sub-cohorts of the patient population for studies. In systems medicine, however, conclusions have to be drawn for each individual patient on the basis of prior experience documented in the data pool.

Based on these data, we are developing a prototype IT system to bring systems medicine into clinical practice. The system contains disease-specific models designed to represent historical treatment outcomes of the corresponding institution as well as new developments in the description of the disease. We plan to implement a process that leads to a model adequately designed, parameterized and validated to represent the reference disease. Data on the treatment of new patients can be added to the data pool, allowing models and parameters to be improved iteratively.

While our system is loaded with data and models specific to multiple myeloma, we intend to provide software that can be adapted to other diseases as well. In that case, a disease-specific data pool and models have to be created. Thus, the architecture of the system can become a blueprint for a systems medicine component in routine clinical care.

CLIOMMICS is funded by the German Federal Ministry of Education and Research (BMBF) under grant 01ZX1309A.


References

1. Wolkenhauer O, Auffray C, Jaster R, Steinhoff G, Dammann O. The road from systems biology to systems medicine. Pediatr Res. 2013;73(4 Pt 2):502–7.
2. Younesi E, Hofmann-Apitius M. From integrative disease modeling to predictive, preventive, personalized and participatory (P4) medicine. EPMA J. 2013;4(1):23.
3. Harousseau J, Moreau P. Autologous hematopoietic stem-cell transplantation for multiple myeloma. N Engl J Med. 2009;360(25):2645–54.
4. Reng C, Debold P, Specker C, Pommerening K. Generische Lösungen der TMF zum Datenschutz für die Forschungsnetze in der Medizin. Berlin: Med. Wiss. Verl.-Ges; 2006. (Schriftenreihe der Telematikplattform für Medizinische Forschungsnetze; 1).
5. Ganslandt T, Mate S, Helbing K, Sax U, Prokosch HU. Unlocking Data for Clinical Research – The German i2b2 Experience. Appl Clin Inform. 2011;2(1):116–27.