gms | German Medical Science

63. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

02. - 06.09.2018, Osnabrück

iSEEing is believing: exploring RNA-seq data, the ideal way

Meeting Abstract

  • Federico Marini - Center for Thrombosis and Hemostasis Mainz (CTH), Mainz, Germany; Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Mainz, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 63. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Osnabrück, 02.-06.09.2018. Düsseldorf: German Medical Science GMS Publishing House; 2018. DocAbstr. 147

doi: 10.3205/18gmds085, urn:nbn:de:0183-18gmds0856

Published: 27 August 2018

© 2018 Marini.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Data exploration is critical to understanding large biological datasets obtained from high-throughput assays such as sequencing. Carried out before rigorous statistical analysis, it can generate novel data-driven hypotheses and help diagnose potential problems in the multi-faceted analysis workflow that follows, from quality control to downstream interpretation of the results, e.g. after detecting genes that are differentially expressed between the experimental conditions of interest.

Visualization and processing of the data in an intuitive and interactive interface is crucial. However, most existing tools for interactive visualization are limited to specific assays or analyses, and lack support for reproducible analysis [1].

Methods: The proposed approach addresses the steps of Exploratory Data Analysis and Differential Expression analysis with a series of R/Bioconductor packages (pcaExplorer, ideal, and iSEE). These packages are designed to enable efficient data exploration for a wide range of researchers; they build on data structures from the open-source Bioconductor project and are implemented in the Shiny framework.
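As a rough illustration of what an exploratory step on expression data can look like, the sketch below log-transforms a toy count matrix and computes pairwise distances between samples. It is written in Python for brevity and is not taken from pcaExplorer, ideal, or iSEE; the count values, sample names, and function names are all assumptions made up for this example.

```python
import math
from itertools import combinations

# Hypothetical toy count matrix (genes x samples); the values are invented.
samples = ["ctrl_1", "ctrl_2", "treat_1", "treat_2"]
counts = {
    "geneA": [100, 120, 10, 12],
    "geneB": [50, 55, 500, 480],
    "geneC": [30, 28, 31, 29],
}


def log_transform(matrix, pseudocount=1.0):
    """Return log2(count + pseudocount) per gene to dampen very large counts."""
    return {
        gene: [math.log2(value + pseudocount) for value in row]
        for gene, row in matrix.items()
    }


def sample_distances(matrix, sample_names):
    """Euclidean distance between every pair of samples, across all genes."""
    distances = {}
    for i, j in combinations(range(len(sample_names)), 2):
        squared = sum((row[i] - row[j]) ** 2 for row in matrix.values())
        distances[(sample_names[i], sample_names[j])] = math.sqrt(squared)
    return distances


distances = sample_distances(log_transform(counts), samples)

# List sample pairs from most to least similar.
for pair, dist in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(dist, 2))
```

In this toy data, replicates of the same condition end up closer to each other than samples from different conditions, which is the kind of pattern an exploratory overview is meant to reveal before any formal testing.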

Results: Key features include: reproducible reports, seamlessly generated as a result of the user’s exploration; guided tours of the web applications, to learn step by step the salient features of the user interface and of the data; dynamically linked charts, transmitting information across panels; and automatic storage of the exact R code needed to generate every plot.
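Two of these features, linked panels and code recording, can be pictured with a small conceptual sketch. The Python below is not how the packages are implemented (they are written in R on top of Shiny); the `Panel` and `CodeTracker` classes, the panel names, and the recorded command strings are hypothetical and serve only to show the idea of one panel restricting its display to a selection made in another while every step is logged for later replay.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Panel:
    """Hypothetical plot panel that can receive a point selection from another panel."""
    name: str
    source: Optional["Panel"] = None  # panel whose selection this one listens to
    selected: Set[str] = field(default_factory=set)

    def visible_points(self, all_points: List[str]) -> List[str]:
        # When linked to a source with an active selection, show only those points.
        if self.source is not None and self.source.selected:
            return [p for p in all_points if p in self.source.selected]
        return list(all_points)


class CodeTracker:
    """Stores one textual command per step so a session can be written out as a script."""

    def __init__(self) -> None:
        self.commands: List[str] = []

    def record(self, command: str) -> None:
        self.commands.append(command)

    def script(self) -> str:
        return "\n".join(self.commands)


cells = ["cell1", "cell2", "cell3", "cell4"]
tracker = CodeTracker()

# The second panel is linked to the first and follows its selection.
reduced_dim = Panel("reduced_dimension_plot")
feature_plot = Panel("feature_assay_plot", source=reduced_dim)

reduced_dim.selected = {"cell2", "cell3"}
tracker.record("select(reduced_dimension_plot, ['cell2', 'cell3'])")

shown = feature_plot.visible_points(cells)
tracker.record(f"plot(feature_assay_plot, points={shown})")

print(shown)
print(tracker.script())
```

Keeping the log as plain text commands is a deliberate simplification: the point is that each interactive action leaves a replayable trace, which is what makes an interactive session reproducible.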

Discussion: The utility and flexibility of this combination of packages can be demonstrated by applying it to a range of real transcriptomics datasets, which can additionally be showcased by deploying dedicated server instances of the developed packages. A practical example for single-cell RNA-sequencing (containing 4000 PBMCs from 10X Genomics) is available at http://shiny.imbei.uni-mainz.de:3838/iSEE_PBMC4k/, where an instance of iSEE runs with a dedicated preset of linked panels, and a customized tour allows scientists to follow the presented steps and findings, as well as to explore the data further.

Availability: pcaExplorer, ideal, and iSEE are publicly available as R packages from the open-source Bioconductor project.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Marini F, Binder H. Development of Applications for Interactive and Reproducible Research: a Case Study. Genomics and Computational Biology. 2016;3(1):e39. DOI: 10.18547/gcb.2017.vol3.iss1.e39