gms | German Medical Science

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH)

September 8-13, 2024, Dresden

Automating Crohn’s Disease Phenotyping: Comparing Natural Language Processing Approaches

Meeting Abstract

  • Susanne Ibing - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, United States; Windreich Dept. of Artificial Intelligence & Human Health, Icahn School of Medicine at Mount Sinai, New York, United States
  • Linea Schmidt - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, United States; Windreich Dept. of Artificial Intelligence & Human Health, Icahn School of Medicine at Mount Sinai, New York, United States
  • Florian Borchert - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, United States
  • Julian Hugo - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
  • Caroline Benson - The Henry D. Janowitz Division of Gastroenterology, Icahn School of Medicine at Mount Sinai, New York, United States
  • Allison Marshall - Department of Medicine, Mount Sinai Health System, New York, United States
  • Jellyana Peraza - The Henry D. Janowitz Division of Gastroenterology, Icahn School of Medicine at Mount Sinai, New York, United States
  • Judy H. Cho - Department of Pathology, Molecular, and Cell Based Medicine, Icahn School of Medicine at Mount Sinai, New York, United States
  • Erwin P. Böttinger - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, United States; Windreich Dept. of Artificial Intelligence & Human Health, Icahn School of Medicine at Mount Sinai, New York, United States
  • Bernhard Y. Renard - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, United States; Windreich Dept. of Artificial Intelligence & Human Health, Icahn School of Medicine at Mount Sinai, New York, United States
  • Ryan C. Ungaro - The Henry D. Janowitz Division of Gastroenterology, Icahn School of Medicine at Mount Sinai, New York, United States

Gesundheit – gemeinsam. Kooperationstagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Deutschen Gesellschaft für Sozialmedizin und Prävention (DGSMP), Deutschen Gesellschaft für Epidemiologie (DGEpi), Deutschen Gesellschaft für Medizinische Soziologie (DGMS) und der Deutschen Gesellschaft für Public Health (DGPH). Dresden, 08.-13.09.2024. Düsseldorf: German Medical Science GMS Publishing House; 2024. DocAbstr. 444

doi: 10.3205/24gmds076, urn:nbn:de:0183-24gmds0764

Published: September 6, 2024

© 2024 Ibing et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: The Montreal Classification (MC) captures the heterogeneity of Crohn's disease (CD) [1]. While the MC is an important tool for characterizing CD, ascertaining it for real-world studies requires manual chart review, which is labor-intensive and scales poorly. We therefore used electronic health records (EHR) to automate MC phenotyping and, with the extracted information, to identify incident CD cases.

Methods: We defined CD patients (n=7,624) from the Mount Sinai Health System EHR based on CD diagnosis codes and medications [2]. We then developed a pipeline for automated extraction of MC disease behavior and age at diagnosis from EHR narrative texts, using two approaches: a rule-based approach built on the spaCy framework [3] and in-context learning with GPT-4. Two reviewers labeled a random selection of clinical notes (n=150) and radiology reports (n=50) at sentence level (n=15,390). The algorithms were evaluated by recall, precision, and F1-score. For each CD patient, the first coded CD diagnosis was taken as the disease index date. We compared the index date with the prior encounter history and the extracted age at diagnosis to identify incident cases. To confirm the validity of the extracted incident case cohort and index date, we conducted manual chart review of 50 randomly selected cases and controls from the resulting cohorts.
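The sentence-level evaluation described above can be illustrated with a minimal sketch. It assumes each sentence carries one binary reference label and one binary prediction for a single phenotype; the function name and the toy labels are hypothetical and are not taken from the study's pipeline.

```python
from typing import Sequence, Tuple


def precision_recall_f1(gold: Sequence[bool], pred: Sequence[bool]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one binary sentence-level label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    # Count true positives, false positives, and false negatives.
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy data: six sentences, True = phenotype mentioned.
gold = [True, True, False, False, True, False]
pred = [True, True, True, False, True, False]
precision, recall, f1 = precision_recall_f1(gold, pred)
```

In this toy example every true mention is found, but one sentence is flagged incorrectly, so recall is perfect while precision is lower.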

Results: For the labeled data, inter-annotator agreement (Cohen's kappa) was 0.84. For the detection of a stricturing or penetrating disease complication in clinical notes, the rule-based and GPT-4-based approaches yielded high recall, precision, and F1-score values (rule-based: 1.00, 0.84, and 0.92; GPT-4: 0.95, 0.86, and 0.90, respectively), with similar performance between the two approaches. Perianal disease was extracted with a recall of 1.00, precision of 0.86, and F1-score of 0.93 using the rule-based approach, and 0.92 using GPT-4. For age at diagnosis, GPT-4 (recall 1.00, precision 0.87, F1-score 0.93) performed slightly better than the rule-based approach (recall 0.81, precision 0.88, F1-score 0.85). Given this performance, we extracted the age at diagnosis from the clinical text of 4,344 CD patients of the Mount Sinai Health System and compared this information with the first coded patient encounters and CD diagnosis in the patients' EHR, resulting in a sub-cohort of 229 incident CD cases. Our phenotyping algorithm identified cases and controls with high accuracy (0.96 and 0.95, respectively). In 83% of cases, the automatically identified first date of CD diagnosis was at most 180 days before the reviewed first date of diagnosis.
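For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of Cohen's kappa for two annotators. The label names and the toy annotations are hypothetical; the sketch illustrates the standard definition, not the study's annotation tooling.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of items")
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under independence, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy annotations for ten sentences.
rater_a = ["complication"] * 3 + ["none"] * 7
rater_b = ["complication"] * 2 + ["none"] * 8
kappa = cohens_kappa(rater_a, rater_b)
```

Here the annotators agree on nine of ten sentences, but because "none" dominates both label distributions, much of that agreement is expected by chance and kappa is noticeably lower than the raw agreement of 0.9.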

Discussion and conclusion: We demonstrate the feasibility of automatically extracting CD diagnosis information and the MC from clinical texts in EHR data with good precision. This approach can facilitate data extraction for real-world research at large scale and proved useful for identifying newly diagnosed patients with CD. The evaluated approaches were based on rules and on a general-purpose large language model, GPT-4. The performance of domain-specific large language models such as MEDITRON [4] or BioMistral [5] may be of interest.

Competing interests: RCU has served as a consultant and/or advisory board member for AbbVie, Bristol Myers Squibb, Celltrion, Inotrem, Lilly, Janssen, Pfizer, Roivant, Takeda. The remaining authors declare no conflict of interest.

The authors declare that a positive ethics committee vote has been obtained.

The contribution has already been published: S. Ibing, L. Schmidt, F. Borchert, J. Hugo, C. Benson, A. Marshall, J. Peraza, B.Y. Renard, J.H. Cho, E.P. Böttinger, R.C. Ungaro: Automating Crohn’s disease phenotyping: a natural language processing approach. Digestive Disease Week 2024, Gastroenterology.


References

1. Silverberg MS, Satsangi J, Ahmad T, Arnott IDR, Bernstein CN, Brant SR, et al. Toward an integrated clinical, molecular and serological classification of inflammatory bowel disease: report of a Working Party of the 2005 Montreal World Congress of Gastroenterology. Canadian Journal of Gastroenterology. 2005;19 Suppl A:5A-36A.
2. Ibing S, Cho JH, Böttinger EP, Ungaro RC. Second-Line Biologic Therapy Following Tumor Necrosis Factor Antagonist Failure: A Real-World Propensity Score-Weighted Analysis. Clinical Gastroenterology and Hepatology. 2023 Sep 1;21(10):2629–38.
3. Schmidt L, Ibing S, Borchert F, Hugo J, Marshall A, Peraza J, et al. Extraction of Crohn’s Disease Clinical Phenotypes from Clinical Text Using Natural Language Processing [Preprint]. medRxiv. 2023. DOI: 10.1101/2023.10.16.23297099
4. Chen Z, Cano AH, Romanou A, Bonnet A, Matoba K, Salvi F, et al. MEDITRON-70B: Scaling Medical Pretraining for Large Language Models [Preprint]. arXiv. 2023. DOI: 10.48550/arXiv.2311.16079
5. Labrak Y, Bazoge A, Morin E, Gourraud PA, Rouvier M, Dufour R. BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains [Preprint]. arXiv. 2024. DOI: 10.48550/arXiv.2402.10373