gms | German Medical Science

Qualität der "Neuen" Lehre in der Medizin
Jahrestagung der Gesellschaft für Medizinische Ausbildung (GMA)

04.11. bis 06.11.2005, Münster

A comparison of scoring algorithms for multiple answer MC-exams


  • Martin Fischer (corresponding author, presenter) - Medizinische Klinik Innenstadt der Universität München, Schwerpunkt Medizindidaktik, Munich, Germany
  • Daniel Bauer
  • Veronika Kopp

Qualität der "Neuen" Lehre in der Medizin. Jahrestagung der Gesellschaft für Medizinische Ausbildung - GMA. Münster, 04.-06.11.2005. Düsseldorf, Köln: German Medical Science; 2005. Doc05gma091

The electronic version of this article is the complete one and can be found online at:

Received: July 15, 2005
Published: October 26, 2005

© 2005 Fischer et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License. You are free to share (to copy, distribute and transmit) the work, provided the original author and source are credited.



Objectives: To compare scoring algorithms commonly used to determine students' scores in multiple-choice (MC) exams with multiple correct answers, with regard to student performance, reliability, item selectivity, and item difficulty.

Methods: We analysed data from the end-of-term exam in internal medicine taken by 420 third-year medical students at Munich University in February 2005. The exam comprised 30 MC questions, each with up to 15 possible answers, up to 6 correct answers, and at least as many distractors as true answers.

Scoring Algorithms: Each question was worth a maximum of one point. No negative scores were applied. We compared:

- "Dichotomous" (D): One point if all true and no wrong answers were chosen.

- "Partial 1" (P1): One point for 100% true answers; 0.5 points for 50% or more true answers; zero points for less than 50% true answers.

- "Partial 2" (P2): A fraction of one point depending on the total number of possible answers was given for each correct decision (picking a right or ignoring a wrong answer); for each wrong decision one such fraction was subtracted.
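
The three rules above can be sketched in code. This is a minimal illustration, not the authors' implementation, and it rests on assumptions the abstract does not spell out: the function names are hypothetical, P1 is read as the share of true answers the student selected (chosen distractors are ignored), the P2 fraction is taken to be one divided by the number of possible answers, and P2 totals below zero are raised to zero because no negative scores were applied.

```python
def score_dichotomous(selected, correct):
    """D: one point only if exactly the true answers were chosen."""
    return 1.0 if set(selected) == set(correct) else 0.0


def score_partial1(selected, correct):
    """P1: 1 point for all true answers, 0.5 for at least half, else 0.

    Assumption: only the share of true answers chosen counts;
    chosen distractors are not penalised here.
    """
    hits = len(set(selected) & set(correct))
    if hits == len(correct):
        return 1.0
    if 2 * hits >= len(correct):
        return 0.5
    return 0.0


def score_partial2(selected, correct, options):
    """P2: +1/n per correct decision, -1/n per wrong decision.

    A decision is correct when a true answer is picked or a
    distractor is ignored. Assumption: n is the number of possible
    answers and the result is floored at zero.
    """
    selected, correct = set(selected), set(correct)
    right = sum((o in selected) == (o in correct) for o in options)
    wrong = len(options) - right
    return max(0.0, (right - wrong) / len(options))


# Hypothetical question: 8 possible answers, 3 of them true.
options = list("ABCDEFGH")
correct = {"A", "B", "C"}
selected = {"A", "B", "D"}  # two true answers and one distractor

d = score_dichotomous(selected, correct)         # 0.0
p1 = score_partial1(selected, correct)           # 0.5
p2 = score_partial2(selected, correct, options)  # 0.5
```

Under these assumptions, a student who finds two of three true answers and picks one distractor gets no credit under D but half a point under both partial schemes.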

Results: The P1 algorithm showed the best results for item selectivity, item difficulty, and internal consistency (Cronbach's alpha).

Conclusions: The P1-algorithm seems to be the preferable method for the scoring of multiple answer MC-exams.