gms | German Medical Science

GMDS 2014: 59. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

07. - 10.09.2014, Göttingen

Methodological aspects in integromics: integrating multiple omics data sets

Meeting Abstract

  • K. Van Steen - Systems and Modeling Unit, Montefiore Institute, University of Liège, Liège, Belgium; Bioinformatics and Modeling, GIGA-R, University of Liège, Liège, Belgium

GMDS 2014. 59. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Göttingen, 07.-10.09.2014. Düsseldorf: German Medical Science GMS Publishing House; 2014. DocAbstr. Keynote Di III

doi: 10.3205/14gmds006, urn:nbn:de:0183-14gmds0060

Published: September 4, 2014

© 2014 Van Steen.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en). You are free: to Share – to copy, distribute and transmit the work, provided the original author and source are credited.


Text

The advent of high-throughput technologies, including sequencers and array-based assays (expression, SNP, CpG), has led to the generation of enormous amounts of data, often referred to as “Big Data”. These biological datasets are heterogeneous and often include gene expression, genotype, epigenome and other data types that are collectively referred to as “-omics” data. As a result, there is a strong effort across multi-disciplinary scientific communities to develop robust, computationally efficient and sensible data processing pipelines that analyze “-omics” data effectively and extract biologically and clinically relevant information, that is, “useful knowledge”.

The enthusiasm about having access to vast information resources comes with a caveat. In contrast to single omics studies, integrated omics studies are extremely challenging. The challenges include developing protocols that standardize data generation and pre-processing or cleansing in integrative analysis contexts; developing computationally efficient analytic tools that extract knowledge from dissimilar data types to answer particular research questions; establishing validation and replication procedures; and building tools to visualize results. However, from a personalized medicine point of view, the anticipated advantages are believed to outweigh any difficulty related to “integromics”. The strong interest in the topic has already resulted in the emergence of new integrative cross-disciplinary techniques based on, for instance, kernel fusion, probabilistic Bayesian networks, correlation networks, statistical dimensionality reduction models, and clustering.
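To make one of the techniques named above more concrete, the following is a minimal sketch of kernel fusion in Python. It is an illustration under stated assumptions, not the method presented in this contribution: the data are simulated, the function names `rbf_kernel` and `fuse_kernels` are hypothetical, and an equally weighted average of Gaussian kernels is only one of several possible fusion schemes. Each omics layer measured on the same samples is turned into a sample-by-sample similarity matrix, and the matrices are then combined into a single one.

```python
import numpy as np


def rbf_kernel(X, gamma=None):
    """Gaussian (RBF) kernel between the rows (samples) of X."""
    X = np.asarray(X, dtype=float)
    if gamma is None:
        # Assumption: a simple default bandwidth of 1 / number of features.
        gamma = 1.0 / X.shape[1]
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-gamma * sq_dists)


def fuse_kernels(kernels, weights=None):
    """Combine per-layer kernels into one kernel by a convex combination."""
    kernels = [np.asarray(K, dtype=float) for K in kernels]
    if weights is None:
        # Assumption: every omics layer contributes equally.
        weights = np.full(len(kernels), 1.0 / len(kernels))
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))


# Simulated toy data: three omics layers measured on the same six samples.
rng = np.random.default_rng(0)
n_samples = 6
expression = rng.normal(size=(n_samples, 20))        # expression levels
genotype = rng.integers(0, 3, size=(n_samples, 50))  # SNPs coded 0/1/2
methylation = rng.uniform(size=(n_samples, 30))      # CpG beta values

layers = [expression, genotype, methylation]
fused = fuse_kernels([rbf_kernel(X) for X in layers])
```

The fused matrix can be passed to any kernel-based method, for example kernel PCA or a support vector machine with a precomputed kernel. In practice the weights would be tuned or learned rather than fixed in advance.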

In this contribution, we will highlight the key steps involved in omics integration efforts and summarize the main analytic paths. We will then zoom in on a novel integrated analysis framework (based on genomic MB-MDR, i.e. model-based multifactor dimensionality reduction). This framework will serve as a common thread for discussing the main issues, pitfalls and merits of integrated analyses. Unprecedented opportunities lie ahead!