gms | German Medical Science

Deutscher Kongress für Orthopädie und Unfallchirurgie (DKOU 2023)

24. - 27.10.2023, Berlin

Artificial intelligence in medicine: An exam comparison between OpenAI’s ChatGPT and medical students in Germany

Meeting Abstract

  • presenting/speaker Jonas Roos - Uniklinikum Bonn, Orthopädie & Unfallchirurgie, Bonn, Germany
  • Mari Babasiz - Uniklinikum Bonn, Orthopädie & Unfallchirurgie, Bonn, Germany
  • Adnan Kasapovic - Uniklinikum Bonn, Orthopädie & Unfallchirurgie, Bonn, Germany
  • Alexander Franz - Uniklinikum Bonn, Orthopädie & Unfallchirurgie, Bonn, Germany
  • Eva-Maria Arndt - Uniklinikum Bonn, Orthopädie & Unfallchirurgie, Bonn, Germany
  • Robert Kaczmarczyk - TUM, Klinikum für Dermatologie, München, Germany

Deutscher Kongress für Orthopädie und Unfallchirurgie (DKOU 2023). Berlin, 24.-27.10.2023. Düsseldorf: German Medical Science GMS Publishing House; 2023. DocAB22-3284

doi: 10.3205/23dkou072, urn:nbn:de:0183-23dkou0725

Published: October 23, 2023

© 2023 Roos et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license details, see http://creativecommons.org/licenses/by/4.0/.



Text

Objectives: The minimum duration of study for a medical degree in Europe is six years, and in Germany, state examinations are taken after the second, fifth, and sixth years. The first and second exams are multiple-choice tests, with a total of 320 questions in the second exam. This examination assesses theoretical clinical knowledge, with questions based on a range of clinical presentations. ChatGPT is an advanced artificial-intelligence language model developed by OpenAI that uses a deep neural network with multiple layers to generate human-like responses. The aim of this study was to analyze to what extent ChatGPT can answer specific medical examination questions and how this capability might be used by medical students in the future.

Methods: We conducted a retrospective analysis of the Fall 2022 second medical state exam in Germany. Only 277 of the questions were available to the authors, so the set was randomly padded with questions from the Spring 2022 exam. ChatGPT was thus given a total of 320 multiple-choice questions in German, taken from the learning platform Amboss (https://www.amboss.com/de).

Results and conclusion: In this retrospective analysis of a German medical state exam, ChatGPT answered only 58% of the questions correctly, falling short of the 60% pass threshold.
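The pass/fail outcome reported above is a simple accuracy calculation against the 60% threshold. A minimal sketch (the count of 186 correct answers is illustrative, back-calculated from the reported 58% of 320 questions):

```python
# Sketch of the pass/fail evaluation described in the abstract.
# correct=186 is an illustrative count (~58% of 320, as reported).
def exam_result(correct: int, total: int, pass_threshold: float = 0.60):
    """Return the accuracy and whether it meets the pass threshold."""
    accuracy = correct / total
    return accuracy, accuracy >= pass_threshold

accuracy, passed = exam_result(correct=186, total=320)
print(f"accuracy = {accuracy:.1%}, passed = {passed}")
```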

However, it should be noted that ChatGPT was trained mainly on English-language text and could not refer back to earlier questions, making sequential, case-based questions harder to answer. Despite this, ChatGPT could be a useful addition to exam preparation, as it not only answers questions but also explains their content. The study suggests that ChatGPT's responses should be fact-checked to avoid imparting false medical knowledge to medical students.