Artificial intelligence in medicine: An exam comparison between OpenAI’s ChatGPT and medical students in Germany
Published: October 23, 2023
Objectives: The minimum duration of medical study in Europe is six years; in Germany, state exams are taken after the second, fifth, and sixth year. The first and second exams are multiple-choice tests, with a total of 320 questions in the second exam, which assesses theoretical clinical knowledge on the basis of different clinical pictures. ChatGPT is an advanced artificial intelligence language model developed by OpenAI that uses a deep neural network with multiple layers to generate human-like responses. The aim of this study is to analyze how well the artificial intelligence ChatGPT can answer specific medical exam questions and how it might be used by medical students in the future.
Methods: We conducted a retrospective analysis of the Fall 2022 second medical state exam in Germany. Only 277 of its questions were available to the authors; for comparison, the set was randomly supplemented with questions from the Spring 2022 exam, so that ChatGPT was given a total of 320 multiple-choice questions in German. The questions were taken from the learning platform Amboss (https://www.amboss.com/de).
Results and conclusion: In the retrospective analysis, ChatGPT answered only 58% of the questions correctly, falling below the pass threshold of 60%. However, ChatGPT was trained mainly on an English-language corpus and could not access the preceding questions of a case, making sequential questions harder to answer. Despite this, ChatGPT could be a useful addition to exam preparation, as it not only answers questions but also explains their content. The study suggests that fact-checking ChatGPT's responses would be necessary to avoid instilling false medical knowledge in medical students.
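The gap between ChatGPT's score and the pass threshold can be made concrete with a short arithmetic sketch. This is an illustration only, not part of the study's methodology; the helper function name and the assumption that the 60% threshold applies as a simple percentage of the 320 questions are ours.

```python
import math

def passes(correct: int, total: int, pass_rate: float = 0.60) -> bool:
    """Return True if the score meets the exam's pass threshold (illustrative)."""
    return correct / total >= pass_rate

TOTAL = 320  # number of multiple-choice questions given to ChatGPT

# Minimum number of correct answers needed to reach 60%
threshold = math.ceil(TOTAL * 0.60)
print(threshold)  # 192

# Assumption: the reported 58% corresponds to roughly 186 correct answers
chatgpt_correct = round(TOTAL * 0.58)
print(passes(chatgpt_correct, TOTAL))  # False
```

Under these assumptions, ChatGPT would have needed 6 more correct answers to pass.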