gms | German Medical Science

14. Grazer Konferenz – Qualität der Lehre: New Horizons in Teaching and Learning

22-24 April 2010, Vienna, Austria

5-station practical assessment - Can it work?

Poster

  • Michaela Wagner-Menghin (corresponding author, presenter) - Medical University of Vienna, Department of Medical Education, Vienna, Austria
  • Ingrid Preusche - Medical University of Vienna, Department of Medical Education, Vienna, Austria
  • Michael Schmidts - Medical University of Vienna, Department of Medical Education, Vienna, Austria

14. Grazer Konferenz – Qualität der Lehre: New Horizons in Teaching and Learning. Wien, Österreich, 22.-24.04.2010. Düsseldorf: German Medical Science GMS Publishing House; 2010. Doc10grako44

DOI: 10.3205/10grako44, URN: urn:nbn:de:0183-10grako441

Published: 18 November 2010

© 2010 Wagner-Menghin et al.
This is an Open Access article distributed under the terms of the Creative Commons license (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.de). It may be reproduced, distributed and made publicly available, provided that the author and source are credited.


Poster

Background: Organizing a practical assessment (PA) for a large cohort raises problems with providing enough rooms, time and trained examiners. Because assessment is an important driver of learning, we set out to redesign the PA for a large-cohort setting. The essential elements of the OSCE (objective structured clinical examination) are kept, but the resource-intensive sampling of stations (15-30 stations are usually recommended) is reduced. But is it worth the effort to work with scores that lack the internationally demanded reliability? Can students who may be performing below acceptable standards (borderline performers) be identified in a justifiable way?

Method: 694 second-year medical students took a 5-station PA (2 basic clinical procedures, 2 physical examinations, 1 history taking), with stations assigned semi-randomly from a pool of 26. This resulted in 3470 student-examiner (SE) encounters instead of the 10410 encounters a 15-station OSCE would require. To handle the incomplete data matrices that result from the semi-random selection of stations, a Rasch model (Partial Credit Model) [1] is used to

1. provide evidence of construct validity,
2. obtain reliability coefficients and
3. determine the borderline group.
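
For reference, the Partial Credit Model in its standard form [1] gives the probability that student n obtains score x on station i. The symbols used here (student ability \theta_n, step threshold \delta_{ik} for step k of station i, maximum score m_i) are notation introduced for this illustration and are not taken from the poster:

```latex
P(X_{ni} = x) =
  \frac{\exp\left( \sum_{k=0}^{x} (\theta_n - \delta_{ik}) \right)}
       {\sum_{h=0}^{m_i} \exp\left( \sum_{k=0}^{h} (\theta_n - \delta_{ik}) \right)},
\qquad x = 0, 1, \dots, m_i,
\qquad \text{with } \sum_{k=0}^{0} (\theta_n - \delta_{ik}) \equiv 0 .
```

Because each probability depends only on the student and the station involved, ability estimates can be placed on one common scale even though each student meets only 5 of the 26 stations, provided the station assignments overlap enough to link all stations.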

Results:

1. The psychometric quality of the 26 stations is satisfactory, indicating construct validity.
2. For medium score groups, the empirical reliability (0.57) approximates the theoretically achievable reliability (0.60) [2].
3. Using a 95% CI, n=210 (30%) are identified as borderline performers.
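
One plausible reading of the 95% CI rule can be sketched as follows: a student counts as a borderline performer when the confidence interval around the ability estimate includes the cut score. The poster does not spell out the exact procedure, so the cut score, the ability values, the standard errors and the names `Estimate` and `classify` are all hypothetical.

```python
from dataclasses import dataclass

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


@dataclass
class Estimate:
    ability: float  # person measure in logits (hypothetical value)
    se: float       # standard error of that measure (hypothetical value)


def classify(est: Estimate, cut_score: float, z: float = Z_95) -> str:
    """Return 'pass', 'fail' or 'borderline' for one student.

    Assumption: the decision is justifiable only if the whole
    confidence interval lies on one side of the cut score.
    """
    lower = est.ability - z * est.se
    upper = est.ability + z * est.se
    if lower > cut_score:
        return "pass"
    if upper < cut_score:
        return "fail"
    return "borderline"


# Illustrative students, not data from the study
students = {
    "A": Estimate(ability=1.5, se=0.5),
    "B": Estimate(ability=0.4, se=0.5),
    "C": Estimate(ability=-1.5, se=0.5),
}
decisions = {name: classify(est, cut_score=0.0) for name, est in students.items()}
```

With only 5 stations per student the standard errors are comparatively large, which widens the interval and enlarges the borderline group.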

Conclusions: Even when the low reliability is taken into account (95% CI), a justifiable pass (or fail) decision can be made with the 5-station PA for 70% of students (n=484). Investing in 10 additional stations (4840 SE encounters) is not necessary for this group. The remaining 30% (n=210) are identified as borderline performers. For them, justifying a pass/fail decision according to international standards requires improving the precision of their scores. If testing continued with up to 10 additional stations, resources would be needed for 2100 additional SE encounters, giving 5570 SE encounters for the total exam, which is only 53% of the encounters in a 15-station OSCE with traditional sampling.
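
The encounter counts quoted above follow from simple multiplication and can be checked with a short script. All figures come from the abstract; the function and variable names are chosen here for illustration.

```python
def encounters(students: int, stations: int) -> int:
    """Each student meets one examiner per station."""
    return students * stations


N_STUDENTS = 694      # cohort size
N_BORDERLINE = 210    # students whose 95% CI does not allow a decision
BASE_STATIONS = 5     # stations in the short practical assessment
FULL_STATIONS = 15    # stations in a traditionally sampled OSCE
EXTRA_STATIONS = FULL_STATIONS - BASE_STATIONS

base_total = encounters(N_STUDENTS, BASE_STATIONS)                  # 3470
full_total = encounters(N_STUDENTS, FULL_STATIONS)                  # 10410
extra_total = encounters(N_BORDERLINE, EXTRA_STATIONS)              # 2100
saved = encounters(N_STUDENTS - N_BORDERLINE, EXTRA_STATIONS)       # 4840
sequential_total = base_total + extra_total                         # 5570

# Share of a full 15-station OSCE that sequential testing would need
share = sequential_total / full_total
print(f"{sequential_total} of {full_total} encounters ({share:.1%})")
```

The computed share is about 53.5%, which the abstract reports as 53%.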

The 5-station PA provided a justifiable decision for the majority of the sample, so it is worth the effort. To improve score precision for the borderline performers, a sequential testing approach is suggested [1], [2].


References

1. Wright BD, Masters GN. Rating scale analysis. Chicago: MESA Press; 1982.
2. Linacre JM. Rasch-based Generalizability Theory. Rasch Meas Trans. 1993;7:283-284.