gms | German Medical Science

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH)

08.09. - 13.09.2024, Dresden

Machine Learning Pipeline for predicting Overall Survival Status and Survival Months in Pancreatic Cancer

Meeting Abstract

  • Vaishnavi Sirul Velaga - University Medical Center Göttingen, Department of Medical Informatics, Göttingen, Germany
  • Sophia Rheinländer - University Medical Center Göttingen, Department of Medical Informatics, Göttingen, Germany
  • Nils Beyer - University Medical Center Göttingen, Department of Medical Informatics, Göttingen, Germany
  • Elisabeth Hessmann - University Medical Center Göttingen, Clinic for Gastroenterology, Gastrointestinal Oncology and Endocrinology, Göttingen, Germany
  • Martin Haubrock - University Medical Center Göttingen, Department of Medical Bioinformatics, Göttingen, Germany
  • Ulrich Sax - University Medical Center Göttingen, Department of Medical Informatics, Göttingen, Germany

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH). Dresden, 08.-13.09.2024. Düsseldorf: German Medical Science GMS Publishing House; 2024. DocAbstr. 730

doi: 10.3205/24gmds108, urn:nbn:de:0183-24gmds1083

Published: September 6, 2024

© 2024 Velaga et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Pancreatic cancer ranks as the 12th most common cancer globally [1]. Because early symptoms are often absent, it is frequently diagnosed at an advanced stage, resulting in a low 5-year survival rate of about 8% [2]. Consequently, there is an urgent need for improved diagnostic methods that detect the cancer earlier. Cancer is caused by gene mutations that lead cells to grow uncontrollably, and analyzing gene expression can help to predict and understand the disease better. In this study, we focus on the power of gene expression profiles to predict overall survival (OS) status and OS months. OS status offers valuable prognostic information to guide treatment decisions. By integrating machine learning (ML) techniques, we aim to develop strong prediction models based on gene expression data. These models are intended to help clinicians customize treatments for pancreatic cancer patients.

Methods: In this study, we used two datasets: the Pancreatic Adenocarcinoma PanCancer dataset from The Cancer Genome Atlas (TCGA) [3], obtained through cBioPortal, and data from the clinical research unit CRU5002 at the University Medical Center Göttingen. The TCGA dataset comprised expression data for 20,531 genes from 179 patients, while the CRU dataset contained expression data for 19,671 genes from 119 patients. We trained various ML classification models (for OS status) and regression models (for OS months) on the TCGA data. Subsequently, we validated the performance of these models on the CRU dataset using well-established evaluation metrics. Additionally, we conducted a correlation analysis between clinical and gene expression data, and applied statistical methods to identify key genes influencing the prediction of OS status and OS months.
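The train-on-one-cohort, validate-on-another design described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the models or metrics used, so a simple nearest-centroid classifier and plain accuracy stand in for them here. The gene names, expression values, and status labels are invented toy data. The sketch shows two steps that any such external validation needs: restricting both cohorts to the genes they share, and scaling the validation cohort with statistics estimated on the training cohort only.

```python
from statistics import mean, pstdev


def shared_genes(train_genes, valid_genes):
    """Genes measured in both cohorts, kept in the training cohort's order."""
    valid_set = set(valid_genes)
    return [g for g in train_genes if g in valid_set]


def fit_scaler(rows, genes):
    """Per-gene mean and standard deviation, estimated on the training cohort only."""
    stats = {}
    for g in genes:
        values = [row[g] for row in rows]
        sd = pstdev(values)
        stats[g] = (mean(values), sd if sd > 0 else 1.0)
    return stats


def transform(rows, genes, stats):
    """Standardize each sample using the training-cohort statistics."""
    return [[(row[g] - stats[g][0]) / stats[g][1] for g in genes] for row in rows]


def fit_centroids(X, y):
    """Mean feature vector per class label (stand-in for an unspecified classifier)."""
    centroids = {}
    for label in set(y):
        members = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(X, centroids):
    """Assign each sample to the class with the nearest centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda lab: sq_dist(x, centroids[lab])) for x in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy "training" cohort (hypothetical values, standing in for TCGA).
train = [
    {"GENE_A": 1.0, "GENE_B": 5.0, "GENE_C": 0.2},
    {"GENE_A": 1.2, "GENE_B": 4.8, "GENE_C": 0.9},
    {"GENE_A": 3.0, "GENE_B": 1.0, "GENE_C": 0.4},
    {"GENE_A": 3.2, "GENE_B": 1.2, "GENE_C": 0.7},
]
train_status = ["deceased", "deceased", "living", "living"]

# Toy "validation" cohort with a partly different gene panel (standing in for CRU).
valid = [
    {"GENE_A": 1.1, "GENE_B": 4.9, "GENE_D": 7.0},
    {"GENE_A": 2.9, "GENE_B": 1.1, "GENE_D": 6.5},
]
valid_status = ["deceased", "living"]

genes = shared_genes(list(train[0]), list(valid[0]))
stats = fit_scaler(train, genes)
centroids = fit_centroids(transform(train, genes, stats), train_status)
predictions = predict(transform(valid, genes, stats), centroids)
acc = accuracy(valid_status, predictions)
print(genes, predictions, acc)
```

Fitting the scaler on the training cohort alone keeps information from the validation cohort out of model fitting. With real data, systematic differences between the two cohorts' measurement platforms would likely need additional handling that this sketch omits.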

Expected results: We anticipate that our study will yield valuable insights into predicting OS status and OS months in pancreatic cancer patients. By leveraging gene expression profiles and clinical data, we aim to develop robust ML models that accurately predict OS status and estimate survival months. Through careful analysis, we expect to identify key genetic markers that play significant roles in determining survival in pancreatic cancer. Moreover, our findings may identify comparable patient subgroups that benefit from targeted therapies [4]. We also consider integrating additional clinical data to enhance predictive accuracy and deepen biological understanding. This broader approach may reveal new patterns that suggest personalized treatment strategies for pancreatic cancer patients.
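One simple way to shortlist candidate marker genes is to rank genes by the strength of their monotonic association with OS months. The sketch below uses Spearman rank correlation for this. That choice is an assumption, since the abstract does not specify which statistical methods are applied. The gene names and values are invented toy data, and the sketch ignores censoring, which a real survival analysis would have to account for.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rank_genes(expression, os_months):
    """Sort genes by absolute Spearman correlation with OS months, strongest first."""
    scores = {g: spearman(v, os_months) for g, v in expression.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical expression values for five patients (one list per gene).
expression = {
    "GENE_A": [2.0, 3.5, 5.1, 6.0, 8.2],  # rises with survival time
    "GENE_B": [9.0, 7.5, 7.0, 4.2, 1.0],  # falls with survival time
    "GENE_C": [3.0, 9.0, 1.0, 7.0, 4.0],  # no clear trend
}
os_months = [4.0, 9.0, 14.0, 20.0, 31.0]

ranking = rank_genes(expression, os_months)
for gene, rho in ranking:
    print(f"{gene}: rho = {rho:+.2f}")
```

With roughly 20,000 genes tested, the resulting correlations would also need a multiple-testing correction before any gene is reported as a marker.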

Discussion: The development of ML models that predict OS status and estimate OS months from gene expression data is noteworthy. These models have the potential to change how clinicians approach treatment decisions, allowing for more personalized and effective interventions. Furthermore, identifying key genes associated with the prediction of OS status and OS months may provide valuable insights into the genetic basis of the cancer, clearing the path for targeted therapies and improved patient care.

Conclusion: Our study demonstrates the power of analyzing gene expression data with ML to help physicians better understand and treat pancreatic cancer. By accurately predicting OS status and estimating OS months, our research contributes to the advancement of diagnostic and prognostic approaches in pancreatic cancer.

Acknowledgements: This work is funded by DFG within the CRU5002 (426671079).

The authors declare that they have no competing interests.

The authors declare that a positive ethics committee vote has been obtained.


References

1. World Cancer Research Fund International (WCRF). Pancreatic Cancer Statistics [Internet]. [cited 2024-04-30]. Available from: https://www.wcrf.org/cancer-trends/pancreatic-cancer-statistics/#:~:text=pancreatic%20cancer%20data-,Pancreatic%20cancer%20is%20the%2012th%20most%20common%20cancer%20worldwide.,most%20common%20cancer%20in%20women
2. Lu W, Li N, Liao F. Identification of Key Genes and Pathways in Pancreatic Cancer Gene Expression Profile by Integrative Analysis. Genes (Basel). 2019 Aug 13;10(8):612. DOI: 10.3390/genes10080612
3. Pancreatic Adenocarcinoma (TCGA, PanCancer Atlas) [Internet]. [cited 2024-04-30]. Available from: https://www.cbioportal.org/study/summary?id=paad_tcga_pan_can_atlas_2018
4. Collisson EA, Bailey P, Chang DK, Biankin AV. Molecular subtypes of pancreatic cancer. Nat Rev Gastroenterol Hepatol. 2019 Apr;16(4):207-220. DOI: 10.1038/s41575-019-0109-y