gms | German Medical Science

64. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie

08. - 11.09.2019, Dortmund

Comparison of open source pipelines for processing of data in metabolomic research

Meeting Abstract

  • Miriam Sieg - Institut für Biometrie und Klinische Epidemiologie, Charité - Universitätsmedizin Berlin, Berlin, Germany
  • Janine Wiebach - Institut für Biometrie und Klinische Epidemiologie, Charité - Universitätsmedizin Berlin, Berlin, Germany
  • Jochen Kruppa - Institut für Biometrie und Klinische Epidemiologie, Charité - Universitätsmedizin Berlin, Berlin, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 64. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS). Dortmund, 08.-11.09.2019. Düsseldorf: German Medical Science GMS Publishing House; 2019. DocAbstr. 156

doi: 10.3205/19gmds070, urn:nbn:de:0183-19gmds0709

Published: 6 September 2019

© 2019 Sieg et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information see http://creativecommons.org/licenses/by/4.0/.


Text

Metabolomics is the study of metabolites in a cell, a tissue or an organism. A snapshot of all metabolites in an individual at a specific point in time is called the metabolome. Because metabolites are the final product in the chain of molecular biological processes and are quite susceptible to physiological and environmental changes, the metabolome is the molecular level closest to the biological phenotype at a certain time or for an outcome of interest. Hence, examining metabolites allows direct conclusions to be drawn about the biological phenotype.

One of the most commonly used techniques for acquiring metabolite data is liquid chromatography-mass spectrometry (LC-MS). The output is highly complex, and its analysis remains a major challenge. Many successive processing steps are required to get from raw metabolite data to annotated metabolites. A crucial part is the preprocessing of the metabolite data, which is needed for high-quality metabolite annotation and correct calculation of the corresponding intensities. The number of open-access computational workflows in R published for processing metabolite data has increased rapidly in the last few years. However, the choice of workflow and parameter settings depends strongly on the researcher and their experience with this type of data analysis. Usually, different preprocessing workflows are tested on the biological data to decide which of them works "best" and has the highest accuracy. Because the biological truth is unknown and there is currently no gold standard, this decision remains subjective, and it is not possible to determine which workflow is optimal. The importance of preprocessing is recognized, and researchers are careful in choosing their preprocessing workflows. Nevertheless, an understanding of the actual impact of different computational preprocessing workflows and varying parameter settings on metabolite data and subsequent annotation is still missing.

In my talk I will present a comparison of different preprocessing pipelines for metabolite data in medical research. A pipeline is selected if it and its parameter settings are provided in a publication, if it is implemented in R, and if the corresponding metabolite data are publicly available. Each metabolite data set is then preprocessed with all selected pipelines, and the resulting peak intensities, the number of peaks and the selection of peaks are compared. Additionally, the preprocessed metabolite data are annotated to inspect the actual impact of the different preprocessing pipelines.
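
To make the comparison of number, selection and intensity of peaks concrete, the following is a minimal sketch of how two peak lists could be compared. It is not the authors' method. The pipelines discussed here are implemented in R; Python is used only as a language-neutral illustration. The `Peak` type, the function names, the greedy matching rule and the tolerance values (`mz_tol`, `rt_tol`) are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Peak:
    mz: float         # mass-to-charge ratio
    rt: float         # retention time in seconds
    intensity: float  # integrated peak intensity


def match_peaks(a, b, mz_tol=0.01, rt_tol=5.0):
    """Greedily pair each peak in `a` with the closest unused peak in `b`
    that lies within both tolerances (hypothetical matching rule)."""
    used = set()
    pairs = []
    for pa in a:
        best, best_dist = None, None
        for j, pb in enumerate(b):
            if j in used:
                continue
            d_mz = abs(pa.mz - pb.mz)
            d_rt = abs(pa.rt - pb.rt)
            if d_mz <= mz_tol and d_rt <= rt_tol:
                # Scale both differences by their tolerance so they are comparable.
                dist = d_mz / mz_tol + d_rt / rt_tol
                if best_dist is None or dist < best_dist:
                    best, best_dist = j, dist
        if best is not None:
            used.add(best)
            pairs.append((pa, b[best]))
    return pairs


def compare_peak_lists(a, b, mz_tol=0.01, rt_tol=5.0):
    """Summarise number, overlap and intensity agreement of two peak lists."""
    pairs = match_peaks(a, b, mz_tol, rt_tol)
    shared = len(pairs)
    union = len(a) + len(b) - shared
    ratios = [pb.intensity / pa.intensity for pa, pb in pairs if pa.intensity > 0]
    return {
        "n_a": len(a),
        "n_b": len(b),
        "shared": shared,
        "jaccard": shared / union if union else 1.0,
        "median_intensity_ratio": median(ratios) if ratios else None,
    }


# Made-up peak lists standing in for the output of two pipelines.
pipeline_a = [Peak(100.000, 60.0, 1000.0), Peak(200.000, 120.0, 500.0),
              Peak(300.000, 180.0, 250.0)]
pipeline_b = [Peak(100.005, 61.0, 1100.0), Peak(200.002, 118.0, 450.0),
              Peak(400.000, 240.0, 800.0)]

result = compare_peak_lists(pipeline_a, pipeline_b)
```

With these made-up lists, two of the three peaks from each pipeline are matched, giving a Jaccard overlap of 0.5. In practice the tolerances would have to be chosen to fit the instrument and chromatography, and that choice itself influences the apparent agreement between pipelines.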

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.