gms | German Medical Science

67. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 13. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e. V. (TMF)

21.08. - 25.08.2022, online

Predicting Clinical Outcomes from Emergency Department Electronic Health Records: Feasibility of Machine Learning Models

Meeting Abstract

  • Felix Knispel - Institute of Medical Informatics, Medical Faculty, RWTH Aachen University, Aachen, Germany
  • Jonas Bienzeisler - Institute of Medical Informatics, Medical Faculty, RWTH Aachen University, Aachen, Germany
  • Alexander Kombeiz - Institute of Medical Informatics, Medical Faculty, RWTH Aachen University, Aachen, Germany
  • Raphael W. Majeed - Institute of Medical Informatics, Medical Faculty, RWTH Aachen University, Aachen, Germany
  • Rainer Röhrig - Institute of Medical Informatics, Medical Faculty, RWTH Aachen University, Aachen, Germany
  • Ekaterina Kutafina - Institute of Medical Informatics, Medical Faculty, RWTH Aachen University, Aachen, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 67. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS), 13. Jahreskongress der Technologie- und Methodenplattform für die vernetzte medizinische Forschung e.V. (TMF). sine loco [digital], 21.-25.08.2022. Düsseldorf: German Medical Science GMS Publishing House; 2022. DocAbstr. 110

doi: 10.3205/22gmds012, urn:nbn:de:0183-22gmds0128

Published: 19 August 2022

© 2022 Knispel et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information see http://creativecommons.org/licenses/by/4.0/.


Introduction: The emergency department is often the first source of patient data in hospitals. It is therefore the first opportunity to make accurate predictions about patient outcomes such as hospital length of stay (LOS). Those predictions are of high importance as they could significantly improve resource planning processes.

While machine learning models present great potential for this domain of research, previous work usually focuses on specific populations of patients, making general applicability of such models in clinical routine very limited. Robust prediction across different hospitals requires the legally and technically complex sharing of large amounts of routine medical data, i.e., electronic health records (EHR).

In Germany, the “Alliance for Information and Communication Technology in Intensive Care and Emergency Medicine” operates the AKTIN Emergency Department Data Registry, providing multi-hospital EHR data fit for research purposes [1], [2].

As part of the use case FORECAST-LOSt of the AKTIN-EZV project within the BMBF-funded Network University Medicine (NUM), we explored the potential of data contributed to the AKTIN registry by developing predictive machine learning models for patient outcomes based on single-site EHR data.

Methods: Data from the emergency department medical record (AKTIN data) were combined with data from subsequent hospital treatment collected for billing purposes (P21 data). For analysis, we included 28,160 cases from one hospital site. All predictors, such as age, triage, gender, vital scores, pregnancy status, and method of transportation, were extracted from the AKTIN data. The endpoints, namely hospital LOS, intensive care unit (ICU) LOS, ventilation time, and survival, were extracted from the P21 data. Features were encoded according to their type, missing values were imputed using k-nearest-neighbor imputation, and the dimensionality of the diagnoses was reduced using factor analysis.
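
The imputation step can be illustrated with a minimal sketch. It is not the authors' implementation: the function `knn_impute`, the distance definition (Euclidean over the features observed in both cases, rescaled by the share of features compared), the choice of k, and the toy values are all assumptions made for illustration.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Assumption: distance is Euclidean over features observed in both rows,
    rescaled by the share of features that could be compared.
    """
    n_features = len(rows[0])

    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf  # no feature in common, rows are not comparable
        return math.sqrt(sum(shared) * n_features / len(shared))

    completed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other rows in which feature j is observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d != math.inf]
            if nearest:
                completed[i][j] = sum(nearest) / len(nearest)
    return completed

# Made-up toy cases: [age, heart rate, systolic blood pressure]
cases = [
    [70, 88, 130],
    [72, None, 135],
    [25, 70, 118],
    [68, 92, None],
]
completed = knn_impute(cases, k=2)
print(completed[1][1])  # 90.0, the mean heart rate of the two closest cases
print(completed[3][2])  # 132.5
```

Distances of this kind depend on feature scale, so the order of scaling and imputation matters in practice; the abstract does not state which order was used.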

Data were standardized before use in the models. After extensive testing, we chose a machine learning model that combines a Random Forest and a Gradient Boosting Machine in a voting process. Models were evaluated using 5-fold cross-validation. Moreover, we established a baseline model against which to compare performance on the LOS regression task: the baseline predicts the average LOS as a function of age, gender, and primary diagnosis group according to G-DRG [3].
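
The combination of voting and 5-fold cross-validation can be sketched as follows. This is a simplified illustration under stated assumptions, not the model from the study: it assumes that voting for a regression task means averaging the member predictions, and it swaps in two deliberately simple stand-in learners (`MeanRegressor`, `NearestNeighbourRegressor`) for the Random Forest and the Gradient Boosting Machine so that the example stays dependency-free. All class names, function names, and the synthetic data are hypothetical.

```python
import random
from statistics import mean

class MeanRegressor:
    """Stand-in learner: always predicts the training mean."""
    def fit(self, X, y):
        self.mean_ = mean(y)
        return self
    def predict(self, X):
        return [self.mean_ for _ in X]

class NearestNeighbourRegressor:
    """Stand-in learner: predicts the target of the closest training row."""
    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self
    def predict(self, X):
        def closest(row):
            dists = [sum((a - b) ** 2 for a, b in zip(row, other))
                     for other in self.X_]
            return self.y_[dists.index(min(dists))]
        return [closest(row) for row in X]

class AveragingEnsemble:
    """Voting for regression, assumed here to be the mean of member predictions."""
    def __init__(self, models):
        self.models = models
    def fit(self, X, y):
        for model in self.models:
            model.fit(X, y)
        return self
    def predict(self, X):
        per_model = [model.predict(X) for model in self.models]
        return [mean(column) for column in zip(*per_model)]

def r2_score(y_true, y_pred):
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(make_model, X, y, k=5):
    """Train on k-1 folds, score R^2 on the held-out fold, repeat k times."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = make_model().fit([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        preds = model.predict([X[i] for i in test_idx])
        scores.append(r2_score([y[i] for i in test_idx], preds))
    return scores

# Synthetic data: [age, triage level] -> length of stay in days (made up).
rng = random.Random(42)
X = [[rng.uniform(18, 90), rng.uniform(1, 5)] for _ in range(100)]
y = [0.05 * age + 1.5 * triage + rng.gauss(0, 0.5) for age, triage in X]

scores = cross_validate(
    lambda: AveragingEnsemble([MeanRegressor(), NearestNeighbourRegressor()]),
    X, y, k=5,
)
print(len(scores), "fold scores, mean R^2:", round(mean(scores), 3))
```

A fresh model is built for every fold, so no information from the held-out cases leaks into training.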

Results: R² values of 0.58, 0.67, and 0.60 were achieved for the LOS, ICU-LOS, and ventilation time regression tasks, respectively. The full diagnosis information was crucial to this performance: without it, R² dropped to 0.17, 0.13, and 0.05, respectively. By comparison, the baseline model for LOS had an R² of -0.24, i.e., it performed worse than predicting the mean LOS for every case. For predicting in-hospital survival, our model achieved an AUROC of 0.94 and a balanced accuracy of 70.4%.
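
For readers less familiar with the reported metrics, the following sketch shows how R² and balanced accuracy are computed and why R² can be negative. The functions follow the standard definitions; the numbers are made-up toy values, not study data, and the class imbalance in the survival example is an assumption for illustration only.

```python
from statistics import mean

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; negative when predictions are worse than the mean."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return mean(recalls)

# Toy length-of-stay values in days.
los_days = [2, 4, 6, 8]
close_fit = [3, 4, 5, 8]
poor_fit = [9, 9, 9, 9]
print(round(r_squared(los_days, close_fit), 2))  # 0.9
print(round(r_squared(los_days, poor_fit), 2))   # -3.2

# Toy survival labels (1 = survived): always predicting the majority class
# gives 80% plain accuracy but only 50% balanced accuracy.
survived = [1] * 8 + [0] * 2
always_survive = [1] * 10
print(balanced_accuracy(survived, always_survive))  # 0.5
```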

Discussion and conclusion: Our preliminary results from one hospital suggest that research-oriented multi-hospital data aggregation is a promising route to improved prediction of important patient outcomes. Model performance remained suboptimal due to limitations in the amount and detail of the available data, but even at this stage, the advantages of ML models over the average LOS according to G-DRG are visible.

The reported work showed the importance of data curation, diagnosis information, dimensionality reduction, and the use of ensemble models for more stable performance.

Work on aggregated models using datasets from multiple hospitals is ongoing.

The authors declare that they have no competing interests.

The authors declare that a positive ethics committee vote has been obtained.


References

1. Ahlbrandt J, et al. Balancing the need for big data and patient data privacy – an IT infrastructure for a decentralized emergency care research database. In: e-Health–For Continuity of Care. Proceedings of MIE2014. IOS Press; 2014. p. 750–754.
2. Brammen D, et al. Das AKTIN-Notaufnahmeregister – kontinuierlich aktuelle Daten aus der Akutmedizin. Med Klin Intensivmed Notfmed. 2022;117(1):24–33.
3. Gesundheitsberichterstattung des Bundes. Indikator 70 der ECHI shortlist: Durchschnittliche Verweildauer nach ausgewählten Diagnosen. [Accessed 2022 Mar 30]. Available from: https://www.gbe-bund.de/gbe/!pkg_olap_tables.prc_set_page?p_uid=gast&p_aid=86786473&p_sprache=D&p_help=2&p_indnr=815&p_ansnr=72640747&p_version=5&D.002=1000002&D.003=1000004/