gms | German Medical Science

67. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 13. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e. V. (TMF)

21.08. - 25.08.2022, online

Cancer Prediction on OMOP CDM – a Rapid Review

Meeting Abstract

  • Najia Ahmadi - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Yuan Peng - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Markus Wolfien - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Michele Zoch - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany
  • Martin Sedlmayr - Institut für Medizinische Informatik und Biometrie, Medizinische Fakultät Carl Gustav Carus der Technischen Universität Dresden, Dresden, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 67. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 13. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 21.-25.08.2022. Düsseldorf: German Medical Science GMS Publishing House; 2022. DocAbstr. 23

doi: 10.3205/22gmds032, urn:nbn:de:0183-22gmds0325

Published: August 19, 2022

© 2022 Ahmadi et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information see http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Omics technologies have led to significant advances in identifying novel disease-associated mutations and have generated large amounts of data, which, in conjunction with clinical data, are highly useful for deriving population- and patient-level predictions [1]. However, data harmonization is a vital step for assessing events and outcomes associated with heterogeneous patient data. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) is an internationally established standard for research data repositories, for which a genomic vocabulary extension was introduced in 2020 [2], [3], [4]. This review aims to evaluate the current potential of OMOP for cancer prediction, especially with regard to research potential, vocabulary coverage, and tools for predictive analyses. The research question is: "Given the existing genomic vocabulary, which predictive models are used for cancer prediction on OMOP, and to what extent?"

Methods: We screened PubMed, BMC, JAMIA, JBI, PLOS ONE, Hindawi, Elsevier, Sage, Springer, Science Direct, Nature, IEEE, and BMC Med. Inform. Decis. Mak. for articles published between 2016 and 2021, using the search string "cancer AND ((machine learning) OR (prediction) OR (algorithm)) AND ((OHDSI) OR (OMOP))", and investigated the predictive models and tools used in the articles.
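The Boolean logic of the search string can be illustrated with a minimal Python sketch. It is not the screening procedure used in this review: the function names `contains` and `matches_query` are hypothetical, and matching whole words case-insensitively is an assumption, since each database applies its own matching rules (e.g., stemming or field restrictions).

```python
import re


def contains(text: str, term: str) -> bool:
    """Return True if `term` occurs in `text` as a whole word or phrase (case-insensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matches_query(text: str) -> bool:
    """Evaluate: cancer AND (machine learning OR prediction OR algorithm) AND (OHDSI OR OMOP)."""
    has_cancer = contains(text, "cancer")
    has_method = any(
        contains(text, term)
        for term in ("machine learning", "prediction", "algorithm")
    )
    has_platform = any(contains(text, term) for term in ("OHDSI", "OMOP"))
    return has_cancer and has_method and has_platform
```

Applied to a title or abstract, `matches_query` returns True only if all three groups of the search string are satisfied; a text that mentions cancer and OMOP but none of the method terms is rejected.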

Results: We found 204 articles, of which only three use predictive algorithms and fulfill our criteria [5], [6], [7], [8]. All three articles transform their datasets to OMOP using ICD-9-CM, ICD-10-CM, SNOMED-CT, and LOINC and build their AI-based analyses on the result (Table 1). The models used range from tree-based methods, e.g., Random Forest (RF) and Gradient Boosting Machine (GBM), to other regression and classification methods, including linear regression, lasso regression, Support Vector Machines (SVM), and k-nearest neighbors.
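How a source code is translated to a standard concept during transformation to OMOP can be sketched as follows. The sketch assumes the OMOP vocabulary tables CONCEPT and CONCEPT_RELATIONSHIP, reduced to a few columns, and the 'Maps to' relationship. The concept IDs 900001 and 900002, the target concept row, and the helper `map_to_standard` are hypothetical placeholders for illustration. They are not real vocabulary content and are not taken from the reviewed articles.

```python
import sqlite3

# In-memory database holding a reduced subset of the OMOP vocabulary tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (
        concept_id       INTEGER PRIMARY KEY,
        concept_name     TEXT,
        vocabulary_id    TEXT,
        concept_code     TEXT,
        standard_concept TEXT
    );
    CREATE TABLE concept_relationship (
        concept_id_1    INTEGER,
        concept_id_2    INTEGER,
        relationship_id TEXT
    );
    """
)

# Placeholder rows: the concept IDs and the target concept are hypothetical.
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?, ?, ?)",
    [
        (900001, "Malignant neoplasm of prostate", "ICD10CM", "C61", None),
        (900002, "Placeholder standard concept", "SNOMED", "PLACEHOLDER", "S"),
    ],
)
conn.execute(
    "INSERT INTO concept_relationship VALUES (?, ?, ?)",
    (900001, 900002, "Maps to"),
)


def map_to_standard(connection, vocabulary_id, concept_code):
    """Return (concept_id, vocabulary_id) of standard concepts a source code maps to."""
    query = """
        SELECT target.concept_id, target.vocabulary_id
        FROM concept AS source
        JOIN concept_relationship AS rel
          ON rel.concept_id_1 = source.concept_id
         AND rel.relationship_id = 'Maps to'
        JOIN concept AS target
          ON target.concept_id = rel.concept_id_2
         AND target.standard_concept = 'S'
        WHERE source.vocabulary_id = ? AND source.concept_code = ?
    """
    return connection.execute(query, (vocabulary_id, concept_code)).fetchall()


standard = map_to_standard(conn, "ICD10CM", "C61")
```

A source code without a 'Maps to' entry yields an empty list, which is one simple way to flag records that cannot be represented with the available vocabularies.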

Discussion: In cancer precision medicine, the need for targeted treatment protocols is increasing, and predictive analyses may provide additional information [9]. The OMOP genomic vocabulary is a step toward the harmonization of genomic data, which potentially enables analyses of both clinical and sequencing data. However, all three articles included in this review were published before 2020, meaning that they do not use the extension. So far, there is no evidence of how predictive methods apply to oncological data represented in OMOP via genomic vocabularies.

Conclusion: This review provides a first insight into the usability of OMOP CDM for cancer prediction. Using comprehensive genomic vocabularies, oncology data can be harmonized in OMOP, and this may lead to essential advancements in this field. In future work, we will evaluate the application of Patient-Level Prediction (PLP) [10], as a predictive AI framework, for cancer precision medicine on OMOP CDM with the genomic vocabulary extension. This work will investigate whether existing modules in PLP can handle real-world oncological data using genomic vocabularies.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Baek B, Lee H. Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data. Sci Rep. 2020 Nov 3;10(1):18951.
2. Genomic Data Harmonization through the OMOP Standardized Vocabularies – OHDSI [Internet]. [cited 2022 May 16]. Available from: https://www.ohdsi.org/2020-global-symposium-showcase-13/
3. Hripcsak G, Duke JD, Shah NH, Reich CG, Huser V, Schuemie MJ, et al. Observational Health Data Sciences and Informatics (OHDSI): Opportunities for Observational Researchers. Stud Health Technol Inform. 2015;216:574–8.
4. Garza M, Del Fiol G, Tenenbaum J, Walden A, Zozus MN. Evaluating common data models for use with a longitudinal community registry. J Biomed Inform. 2016 Dec;64:333–41.
5. Felmeister AS, Waanders AJ, Leary SES, Stevens J, Mason JL, Teneralli R, et al. Preliminary exploratory data analysis of simulated national clinical data research network for future use in annotation of a rare tumor biobanking initiative. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); 2017 Nov 13-16; Kansas City, MO, USA. IEEE; 2017. p. 2098–104.
6. Meystre SM, Heider PM, Kim Y, Aruch DB, Britten CD. Automatic trial eligibility surveillance based on unstructured clinical data. International Journal of Medical Informatics. 2019 Sep 1;129:13–9.
7. Seneviratne MG, Banda JM, Brooks JD, Shah NH, Hernandez-Boussard TM. Identifying Cases of Metastatic Prostate Cancer Using Machine Learning on Electronic Health Records. AMIA Annu Symp Proc. 2018;2018:1498–504.
8. Ahmadi N, Peng Y, Wolfien M, Zoch M, Sedlmayr M. Cancer Prediction on OMOP CDM – A Rapid Review / Study Registration. 2022 May 20 [cited 2022 May 23]. Available from: https://osf.io/su69b/
9. Cirillo D, Valencia A. Big data analytics for personalized medicine. Curr Opin Biotechnol. 2019 Aug;58:161–7.
10. Rijnbeek P, Reps J. Chapter 13 Patient-Level Prediction. In: Observational Health Data Sciences and Informatics, editor. The Book of OHDSI. 2021 [cited 2022 Jan 25]. Available from: https://ohdsi.github.io/TheBookOfOhdsi/