gms | German Medical Science

GMS Journal for Medical Education

Gesellschaft für Medizinische Ausbildung (GMA)

ISSN 2366-5017

Guessing right – whether and how medical students give incorrect reasons for their correct diagnoses

Article: Virtual Patients

  • corresponding author Leah T. Braun - Ludwig-Maximilians-University (LMU) Munich, Klinikum der Universität München, Medizinische Klinik und Poliklinik IV, München, Germany; Ludwig-Maximilians-University (LMU) Munich, Klinikum der Universität München, Institut für Didaktik und Ausbildungsforschung in der Medizin, München, Germany
  • author Katharina F. Borrmann - Ludwig-Maximilians-University (LMU) Munich, Klinikum der Universität München, Institut für Didaktik und Ausbildungsforschung in der Medizin, München, Germany
  • author Christian Lottspeich - Ludwig-Maximilians-University (LMU) Munich, Klinikum der Universität München, Medizinische Klinik und Poliklinik IV, München, Germany
  • author Daniel A. Heinrich - Ludwig-Maximilians-University (LMU) Munich, Klinikum der Universität München, Medizinische Klinik und Poliklinik IV, München, Germany
  • author Jan Kiesewetter - Ludwig-Maximilians-University (LMU) Munich, Klinikum der Universität München, Institut für Didaktik und Ausbildungsforschung in der Medizin, München, Germany
  • author Martin R. Fischer - Ludwig-Maximilians-University (LMU) Munich, Klinikum der Universität München, Institut für Didaktik und Ausbildungsforschung in der Medizin, München, Germany
  • author Ralf Schmidmaier - Ludwig-Maximilians-University (LMU) Munich, Klinikum der Universität München, Medizinische Klinik und Poliklinik IV, München, Germany

GMS J Med Educ 2019;36(6):Doc85

doi: 10.3205/zma001293, urn:nbn:de:0183-zma0012937

This is the English version of the article.
The German version can be found at:

Received: January 28, 2019
Revised: May 4, 2019
Accepted: June 6, 2019
Published: November 15, 2019

© 2019 Braun et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at


Background: Clinical reasoning is one of the central competencies in everyday clinical practice. Diagnostic competence is often measured in terms of diagnostic accuracy. It is implicitly assumed that a correct diagnosis is based on a sound diagnostic process, although this has never been empirically tested. This study analyzed the frequency and nature of errors in students’ diagnostic processes in correctly solved cases.

Method: 148 medical students processed 15 virtual patient cases in internal medicine. After each case, they were asked to state their final diagnosis and justify it. These explanations were qualitatively analyzed and assigned to one of the following three categories:

correct explanation,
incorrect explanation and
diagnosis guessed right.

Results: The correct diagnosis was made 1,135 times out of 2,080 diagnostic processes. The analysis of the associated diagnostic explanations showed that

92% (1,042) of the reasoning processes were correct,
7% (80) were incorrect, and
1% (13) of the diagnoses were guessed right.

Causes of incorrect diagnostic processes were primarily a lack of pathophysiological knowledge (50%) and a lack of diagnostic skills (30%).

Conclusion: Generally, if the diagnosis is correct, the diagnostic process is also correct. The rate of guessed diagnoses is quite low at 1%. Nevertheless, about every 14th correct diagnosis is based on a false diagnostic explanation and thus a wrong diagnostic process. To assess diagnostic competence, both the diagnostic result and the diagnostic process should be recorded.

Keywords: Clinical reasoning, diagnostic errors, qualitative research, virtual patients


Introduction

During an average working day, a general practitioner sees 45 patients [1] and makes many diagnostic decisions. The figure for a doctor working in a hospital is probably similar. This illustrates the importance of diagnostic competence in everyday clinical practice. For this reason, diagnostic competence is also one of the central topics in medical education research and part of the medical curriculum.

Diagnostic competence can be captured using several parameters: a standard method is to measure diagnostic accuracy as an outcome parameter (often binary coded: right vs. wrong) [2], [3], [4]. Diagnostic efficiency (the number of correctly diagnosed cases divided by the time needed for diagnosis), for example, can be used to capture the quality of the diagnostic process [2]. For a meaningful assessment of gains in diagnostic knowledge, not only factual knowledge but also conditional and procedural knowledge should be recorded (e.g. with a 3-component test) [5]. For an optimal assessment of diagnostic competence, it is helpful to combine different assessment methods, such as the diagnostic result and one of the above-mentioned process parameters; Ilgen [6] emphasizes that diagnostic competence does not end with arriving at a correct diagnosis, but that the associated diagnostic process also plays a role. In addition, the analysis of the causes of diagnostic errors [7] or of the cognitive steps during diagnosis [8] can be used to determine diagnostic competence or deficits in the clinical decision-making process. The causes leading to misdiagnosis have also been investigated for medical students [7], based on Graber’s classification [9]. There are eight different cognitive reasons for incorrect diagnoses: lack of diagnostic skills (for example in the interpretation of an ECG), lack of knowledge, faulty context generation, faulty triggering, misidentification (e.g. confusing myocarditis and endocarditis), premature closure, over- or underestimation of findings, and failure to arrive at any diagnosis at all. One problem, however, is that it is not yet known how many correct diagnoses are guessed right, or whether such correct diagnoses rest on a faulty diagnostic process.
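The efficiency measure mentioned above amounts to a one-line calculation. The following minimal sketch uses hypothetical numbers (9 correct cases in 90 minutes), not data from the study:

```python
# Diagnostic efficiency as defined above: the number of correctly
# diagnosed cases divided by the time needed for diagnosis.
# The example values are hypothetical, not study data.

def diagnostic_efficiency(correct_cases: int, total_minutes: float) -> float:
    """Return correctly diagnosed cases per minute of diagnostic work."""
    if total_minutes <= 0:
        raise ValueError("time must be positive")
    return correct_cases / total_minutes

# A student who solves 9 of 15 cases correctly in 90 minutes:
print(diagnostic_efficiency(9, 90))  # 0.1 correct diagnoses per minute
```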

Comparatively much is known about the diagnostic process and the explanations behind misdiagnoses; by contrast, only a few studies have so far examined correct diagnoses. It has previously been assumed that a correct diagnosis rests on a correct diagnostic process with a correct diagnostic explanation, but to our knowledge this has never been empirically verified.

Against this background, the following research questions were addressed: How many correct diagnoses are based on a wrong diagnostic process, and which error types occur? How many of the correct diagnoses were guessed right?

To answer these questions, the diagnostic explanations given by medical students in a controlled setting were evaluated qualitatively.


Methods

Study design and participants

This article presents the qualitative data from a large, randomized intervention study analyzing the effects of various scaffolding methods (representation prompts, structured reflection and feedback) on the diagnostic skills of medical students. The quantitative results of this study are part of another publication [10].

During the summer of 2017, 151 advanced medical students in the clinical phase of their studies at Ludwig-Maximilians-University (LMU) Munich and the Technical University of Munich processed 15 virtual patient cases on the electronic learning platform CASUS [11]. All subjects volunteered to participate in the study; the prerequisite for participation was completion of the internal medicine module (6th and 7th semester). Participants were made aware of the study via circulars and notice boards. The cases were carefully piloted with ten students. Following a socio-demographic questionnaire, a test established the participants’ prior knowledge, and an introductory video explained the technicalities of the learning platform. The participants then worked on 15 cases, apart from a control group that solved only 10 cases. After the medical history and the physical examination in each case, participants had access to virtual patient records with data from various technical examinations (such as laboratory results, an ECG or an X-ray). Finally, students had to state and justify their final diagnosis. An exemplary typology of two cases according to Huwendiek et al. is shown in table 1 [Tab. 1] [12]. Participants received a financial allowance of 30 euros. A certificate of compliance for the study was issued by the Ethics Committee of the LMU Munich (number 75-16).

Evaluation and statistics

The correct diagnoses were determined in advance by the case authors (LB and KB) and a team of four physician experts. In addition, it was determined which information (technical examinations and key terms) had to be included in an explanation for it to be classified as correct.

All correct diagnoses were examined exploratively. After a coding scheme had been developed, all justifications were assigned to one of the following three categories: correct explanation, incorrect explanation and guessed diagnosis. The category “correct explanation” included all justifications in which no wrong statements were made. The category “incorrect explanation” included all justifications that contained an objective error, such as a misread ECG or an incorrect pathophysiological explanation of symptoms. The category “guessed diagnosis” included only justifications in which the subjects explicitly stated that they had guessed the diagnosis. The definitions of the three categories, together with suitable examples, are shown in table 2 [Tab. 2].

All incorrect explanations were additionally discussed jointly by two of the authors (LB and RS) and assigned to a further category. The incorrect explanations were subdivided according to which aspects were wrong. The categorization was based on the classification of diagnostic errors made by medical students [7]. Four categories were distinguished: lack of diagnostic skills in the interpretation of technical examination findings, lack of pathophysiological knowledge, incorrect causal relationships, and general uncertainty in the diagnosis.

The statistical evaluation was carried out using SPSS 25.



Results

148 out of 151 participants processed all cases and were included in the data evaluation. On average, the students were 25.3 (SD=3.3) years old and had 3.3 (SD=1.0) months of clinical experience. The average final grade was 1.6 (SD=0.6), the grade in internal medicine was 2.2 (SD=1.3), and the oral and written grades in the first medical state examination (Physikum) were 2.3 (SD=1.0) and 2.5 (SD=0.9), respectively.

Diagnostic evidence and forms of diagnostic reasoning

In total, over 2,000 diagnostic processes were recorded, of which 814 ended with a misdiagnosis. The correct diagnosis was made in 1,135 diagnostic processes.

Most of the diagnostic explanations (between 86% and 100% per case) were correct, except in case 7 (heart failure), where only 70% of the explanations were correct. There was no correlation between the overall difficulty of a case – reflected by the diagnostic accuracy – and the rate of erroneous reasoning (see table 3 [Tab. 3]). Almost none of the correct solutions were guessed: the rate of correctly guessed diagnoses was between 0% and 4.5% per case (see table 3 [Tab. 3]). The diagnostic explanations did not improve in thematically similar cases with the same diagnosis.

Across all cases, 80 justifications (7%) were incorrect. These were assigned to the four categories described above; examples of all categories are shown in table 4 [Tab. 4]. A lack of pathophysiological knowledge was the most common source of error at 50% (40 of 80 errors), followed by a lack of diagnostic skills (30%).


Discussion

In this study, we were able to show that a faulty diagnostic process lay behind 7% of the correct diagnoses. Four different causes of these errors were identified: lack of pathophysiological knowledge, lack of diagnostic skills, incorrect causal relationships, and an inability to reduce diagnostic uncertainty through the diagnostic process.

Considering the results, the following aspect is striking: the number of correctly guessed diagnoses is low and clearly below the statistically expected chance rate. Apart from cases 10 and 15, all cases had dyspnoea as the main symptom, for which, leaving aside exotic diagnoses and atypical presentations, only a limited number of diagnoses is possible.

Each case was designed so that three differential diagnoses were plausible after the medical history had been taken; the additional information (physical examination and technical examinations) then pointed towards one specific diagnosis in each case. A chance success rate of roughly 30% could therefore be expected. However, very few students arrived at the right diagnosis by chance; a well-thought-out diagnostic process lay behind almost all stated diagnoses. Another study likewise showed for incorrect diagnoses that only a small number are due to a complete lack of knowledge [7].
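This chance estimate can be made explicit with a short calculation, using the three built-in differential diagnoses per case and the guessing figures reported in the results (13 of 1,135 correct diagnoses):

```python
# Expected success rate under pure guessing: each case was designed with
# three plausible differential diagnoses, so a random choice would be
# correct about one third of the time.
n_differentials = 3
p_guess = 1 / n_differentials    # ~33% expected by chance

# Observed rate of explicitly guessed (yet correct) diagnoses in this study:
observed_guess_rate = 13 / 1135  # ~1.1%

print(f"chance level: {p_guess:.0%}, observed guessing: {observed_guess_rate:.1%}")
```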

Overall, this is a reassuring result: even in experimental and virtual settings, diagnoses are not simply guessed but are usually based on a well-thought-out (even if incorrect) diagnostic process.

Sources of error similar to the known causes of diagnostic errors could also be identified in this study. Despite somewhat contradictory findings in the literature [9], [13], a lack of knowledge and a lack of diagnostic skills should not be underestimated as sources of error and should be addressed in the medical curriculum. Overall, it was confirmed that examining the diagnostic process is essential, because a correct diagnosis does not always imply a faultless diagnostic process. Future studies in the field of clinical reasoning should therefore record both the diagnostic result and the diagnostic process in order to gain a comprehensive picture of a person’s diagnostic competence. Computer-aided methods of text analysis could be helpful here.

Despite the large number of over 2,000 diagnostic processes analyzed, the results of this study are limited to the field of internal medicine and should be replicated with cases from other specialties. Furthermore, we were only able to make statements about the diagnostic processes of medical students; no conclusions can be drawn about other levels of expertise.

One advantage is that the diagnostic processes were not disturbed by the methodology of our study design, as can happen with think-aloud protocols, for example [14]. For teaching purposes, it would be desirable in future for the processing of virtual patient cases to include not only feedback on the case solution but also individual feedback on the reasoning.


Conclusion

In this study, the diagnostic explanations for correct diagnoses were analyzed for the first time in a controlled setting. 7% of the correct diagnoses were based on erroneous diagnostic processes; 1% of the diagnoses were simply guessed right.

Competing interests

The authors declare that they have no competing interests.


References

Rieser S. Ärztemonitor: Zufrieden - aber es fehlt an Zeit [Ärztemonitor: satisfied - but short of time]. Dtsch Arztebl Int. 2014;111(29-30):1278.
Braun LT, Zottmann JM, Adolf C, Lottspeich C, Then C, Wirth S, Fischer MR, Schmidmaier R. Representation scaffolds improve diagnostic efficiency in medical students. Med Educ. 2017;51(11):1118-1126. DOI: 10.1111/medu.13355
Mamede S, van Gog T, Sampaio AM, de Faria RM, Maria JP, Schmidt HG. How can students' diagnostic competence benefit most from practice with clinical cases? The effects of structured reflection on future diagnosis of the same and novel diseases. Acad Med. 2014;89(1):121-127. DOI: 10.1097/ACM.0000000000000076
Ilgen JS, Bowen JL, McIntyre LA, Banh KV, Barnes D, Coates WC, Druck J, Fix ML, Rimple D, Yarris LM, Eva KW. Comparing diagnostic performance and the utility of clinical vignette-based assessment under testing conditions designed to encourage either automatic or analytic thought. Acad Med. 2013;88(10):1545-1551. DOI: 10.1097/ACM.0b013e3182a31c1e
Schmidmaier R, Eiber S, Ebersbach R, Schiller M, Hege I, Holzer M, Fischer MR. Learning the facts in medical school is not enough: which factors predict successful application of procedural knowledge in a laboratory setting? BMC Med Educ. 2013;13:28. DOI: 10.1186/1472-6920-13-28
Ilgen JS, Eva KW, Regehr G. What's in a Label? Is Diagnosis the Start or the End of Clinical Reasoning? J Gen Intern Med. 2016;31(4):435-437. DOI: 10.1007/s11606-016-3592-7
Braun LT, Zwaan L, Kiesewetter J, Fischer MR, Schmidmaier R. Diagnostic errors by medical students: results of a prospective qualitative study. BMC Med Educ. 2017;17(1):191. DOI: 10.1186/s12909-017-1044-7
Kiesewetter J, Ebersbach R, Gorlitz A, Holzer M, Fischer MR, Schmidmaier R. Cognitive problem solving patterns of medical students correlate with success in diagnostic case solutions. PLoS One. 2013;8(8):e71486. DOI: 10.1371/journal.pone.0071486
Graber ML, Franklin N, Gordon R. Diagnostic error in internal medicine. Arch Intern Med. 2005;165(13):1493-1499.
Braun LT, Borrmann KF, Lottspeich C, Heinrich DA, Kiesewetter J, Fischer MR, Schmidmaier R. Scaffolding clinical reasoning of medical students with virtual patients: effects on diagnostic accuracy, efficiency, and errors. Diagnosis (Berl). 2019;6(2):137-149. DOI: 10.1515/dx-2018-0090
Fischer MR, Schauer S, Gräsel C, Baehring T, Mandl H, Gärtner R, Scherbaum W, Scriba PC. Modellversuch CASUS. Ein computergestütztes Autorensystem für die problemorientierte Lehre in der Medizin [CASUS model trial. A computer-assisted author system for problem-oriented learning in medicine]. Z Arztl Fortbild (Jena). 1996;90(5):385-389.
Huwendiek S, de Leng BA, Zary N, Fischer MR, Ruiz JG, Ellaway R. Towards a typology of virtual patients. Med Teach. 2009;31(8):743-748. DOI: 10.1080/01421590903124708
Zwaan L, de Bruijne M, Wagner C, Thijs A, Smits M, van der Wal G, Timmermans DR. Patient record review of the incidence, consequences, and causes of diagnostic adverse events. Arch Intern Med. 2010;170(12):1015-1021.
Konrad K. Lautes Denken [Thinking aloud]. In: Handbuch qualitative Forschung in der Psychologie. Heidelberg: Springer; 2010. p.476-490. DOI: 10.1007/978-3-531-92052-8_34