## Suitability of the HAM-Nat test and TMS module "basic medical-scientific understanding" for medical school selection

| Received: | November 30, 2011 |
|---|---|
| Revised: | June 20, 2012 |
| Accepted: | June 21, 2012 |
| Published: | November 15, 2012 |

### Abstract

**Aims:** Tests with natural-science content predict success in the first semesters of medical studies. Some universities in German-speaking countries use the ‘Test for Medical Studies’ (TMS) for student selection. One of its modules, “medical and scientific comprehension”, measures the ability for deductive reasoning. In contrast, the Hamburg Assessment Test for Medicine, Natural Sciences (HAM-Nat) evaluates knowledge in natural sciences. This study compares the predictive power of the HAM-Nat with that of the NatDenk test, which is similar to the TMS module “medical and scientific comprehension” in content and structure.

**Methods:** In 2007, 162 first-year medical students volunteered to complete either the HAM-Nat (N=77) or the NatDenk test (N=85). By spring 2011, 84.6% of them had passed the first part of the medical state examination in Hamburg. Using different logistic regression models, we tested the predictive power of high school grade point average (GPA or “Abiturnote”) and of the test results (HAM-Nat and NatDenk) with regard to the study success criterion “first part of the medical state examination passed by the end of the 7th semester” (Success7Sem). The odds ratios (OR) for study success are reported.

**Results:** In both test groups, test results were significantly related to study success (HAM-Nat: OR=2.07; NatDenk: OR=2.58). If both admission criteria are estimated in one model, the main effects (GPA: OR=2.45; test: OR=2.32) and their interaction effect (OR=1.80) are significant in the HAM-Nat group, whereas in the NatDenk group only the test result (OR=2.21) contributes significantly to the variance explained.

**Conclusions:** On their own, both HAM-Nat and NatDenk have predictive power for study success, but only the HAM-Nat explains additional variance when combined with GPA. Under the current circumstances of medical school selection (many good applicants and only a limited number of available places), selection by HAM-Nat and GPA has the highest predictive power of all models.

### Introduction

Applicants to medical school in Germany are confronted with a variety of admission procedures [1]. While GPA is legally prescribed as a decisive selection criterion for all universities [2], individual faculties may consider other criteria such as tests, interviews or special achievements. The best-known German admission test for medical courses is the “Test for Medical Studies” (TMS) [3]. After the TMS was no longer used in Germany after 1996, universities in Baden-Württemberg began to consider good test results in their admission process in 2007. In the winter semester of 2011/12, 10 out of 34 German medical schools granted bonus points for good TMS results.

Since 2008 the “Hamburg Assessment Test for Medicine, Natural Sciences” (HAM-Nat) has been used at the University Hospital Hamburg-Eppendorf. HAM-Nat and TMS were both developed specifically for the selection of medical students, but they target different constructs. The HAM-Nat examines knowledge of natural science at a higher education level, as relevant to medical studies. The TMS includes the module “medical and scientific comprehension” with questions on natural sciences; these, however, are meant to indicate “deductive reasoning in subject specific areas and contexts”, for which no specialised knowledge is required [4]. This module is one of four medicine-related test parts on which the predictive efficiency of the TMS is based [3]. To prepare for the HAM-Nat, motivated applicants revise high school topics in the natural sciences, which at the same time prepares them for the first semesters. In contrast, preparation for the TMS has no relation to the content of the university curriculum and, according to the developers of the test, leads to only minor improvements in results [3].

The HAM-Nat was developed with the aim of reducing the university dropout rate during the first part of the course. Since GPA (Abiturdurchschnittsnote) shows a corrected predictive power of r=0.58 [5] for university marks during the pre-clinical part of the course, it is a solid predictor for success in the course (results of the first part of the medical state examination), and the question arises what benefit is gained if further tests for medical school selection are applied.

HAM-Nat as well as GPA were significant predictors of study success after the second semester in the 2006 Hamburg cohort (r=0.31 and r=0.26, respectively); together they explained 13% of the total variance [6]. In that study, success was operationalized as the number of passed examinations. The contribution of the HAM-Nat to the explained variance (9.5%) was higher than the single contribution of the GPA (6.6%). In a sub-sample corresponding to the quota of places allocated by the university itself, no significant correlation between GPA and HAM-Nat was found (r=0.11) owing to the restricted variance of the GPA [6], indicating that differences within the range of very good GPAs (1.2-1.7) have no substantial influence on HAM-Nat results. For the same reason, GPA no longer correlates significantly with study success within this group. In contrast, the predictive power of the HAM-Nat in this group falls only slightly, from r=0.31 to r=0.26.

The results of different HAM-Nat test versions from pilot studies with first-year students in 2006 and 2007 correlate with GPA between r=0.12 and r=0.34 [6], [7]. The TMS module “medical and scientific comprehension” belongs to a group of TMS modules correlating with GPA between r=0.28 and r=0.40 [3]. The correlation of the entire TMS with GPA is higher, ranging from r=0.36 to r=0.48. Nevertheless, the TMS developers conclude that school grades and test measure different aspects of achievement [3], [4]. The TMS module “medical and scientific comprehension” captures the ability to quickly extract the essential information from written material and, based on this, to draw the correct conclusions (p. 53 in [3]). Studies show that this requires an internal representation of the text information in working memory, the construction of an inference chain, and the recoding of text statements into a mental image.

The correlation between the HAM-Nat and the test module “scientific reasoning” (NatDenk) was explored in a second study before the HAM-Nat was introduced into the Hamburg selection process. This module is similar in structure and content to the TMS module “medical and scientific comprehension”. Depending on the test version, the correlation between HAM-Nat and NatDenk was r=0.34 or r=0.21 [7]; the two tests thus represent different constructs.

Medical faculties in Austria, Germany, and Switzerland use both knowledge tests such as the HAM-Nat and ability tests such as the TMS in their selection process. The aim is always to optimise study success. In this study we compare the predictive efficiency of both tests with regard to study success. As the main parameter for study success we chose the successful completion of the first part of the medical state examination within the first seven semesters (*Success7Sem*), since only very few students at Hamburg University pass this examination later. The second criterion was the successful completion of the first part of the medical state examination within the first four semesters (*Success4Sem*), since many universities aim at completion within the standard period of study.

### Methods

#### HAM-Nat

The 2007 version of the HAM-Nat consists of 60 multiple-choice questions from the fields of mathematics, chemistry, physics and biology. The content is drawn from thematic areas relevant to medicine and is of German high school standard. The topics covered, as well as a self-test consisting of questions from the 2006 and 2007 test versions, are available on the webpage of the University Hospital Eppendorf (http://www.uke.de/studienbewerber). One out of five answer choices is correct, and participants were allowed 1.5 minutes per question. The questions were developed by secondary school teachers and by lecturers of the clinical and theoretical subjects of the medical faculty.

#### Test-module “scientific reasoning”

To compare the HAM-Nat with the TMS module “medical and scientific comprehension” as an external criterion, ITB-Consulting GmbH (developers of the TMS) developed a test module (“scientific reasoning”, NatDenk) consisting of 24 multiple-choice questions, similar to the TMS module in content and structure. For each question, one of five answer choices correctly fits a natural-science scenario presented beforehand. Participants are allowed a total of 55 minutes for all 24 questions. The tasks do not require specific natural-science knowledge but are aimed at comprehension and the ability for deductive reasoning. The right to conduct the test module was obtained from ITB-Consulting.

#### Study success

Study success was operationalized as “first part of the medical state examination passed by the end of the 7th semester” (*Success7Sem*). This dichotomous criterion was chosen as the main criterion since reducing the number of dropouts in the pre-clinical study phase is the primary aim of the HAM-Nat. The probability of passing after the 7th semester is very low. To verify the stability of the models, an additional dichotomous criterion for study success was chosen (passing the same examination within the standard period of study, i.e. after four semesters; *Success4Sem*), and the correlation between both criteria was calculated. The latter criterion is used specifically when the focus is on monetary aspects, as scheduled progression of study is financially efficient. For the sake of clarity, the results for the criterion *Success4Sem* are only discussed where they differ from those for the main criterion *Success7Sem*.

#### Study Design

The study was part of a wider study on the parallel-test and retest reliability of the HAM-Nat in 2007 [7]. Participation was offered to incoming medical students during the orientation phase. Four weeks after the start of the semester, all participants were randomly assigned to one of two groups: one group took the HAM-Nat, while the other took the test module “scientific reasoning” (NatDenk). Participation in the study was voluntary; the tests themselves were taken after a compulsory course. Testing was organised by members of our research group and supervised by lecturers of the medical faculty. The test “scientific reasoning” was conducted specifically for the purpose of the study. The test results were merged with study-success data in May 2011.

#### Sample

The study used a sample of first-year students of the year 2007 who had been selected on the basis of GPA or had been accepted through other quotas (waiting time since school graduation, foreigner quota). In the GPA quota only applicants with a GPA<1.6 (Abiturdurchschnittsnote) were admitted. Only data of participants with known GPA were included. The total sample on which the analysis is based comprised 162 first-year students. This sample was representative of the entirety of first-year students in the orientation phase in terms of gender, age and GPA. About half of the participants (N=77) took the 2007 HAM-Nat; the other half (N=85) took the module “scientific reasoning” (NatDenk) a week later. All participants gave written consent to the use of their data.

#### Statistical Analysis

To place test results and GPA on a common scale, both were z-transformed. The predictive power of the GPA and of the test results (HAM-Nat and NatDenk) in relation to the study success criteria *Success7Sem* and *Success4Sem* was examined via three different logistic regression models. GPA and test result each served as within-subject factors; test group membership (HAM-Nat vs. NatDenk) was the between-subject factor. The odds ratios and 95% confidence intervals are given. IBM SPSS Statistics, Version 19.0.0 was used for analysis.
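The analysis steps described above (z-transformation followed by logistic regression) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study itself used SPSS, the data below are invented, the helper names `z_scores` and `fit_logistic` are hypothetical, and the fit uses simple gradient ascent rather than whatever estimation routine SPSS applies. The GPA z-score is sign-flipped on the assumption that the reported odds ratios refer to an *improvement* in GPA.

```python
import math
from statistics import mean, stdev


def z_scores(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def fit_logistic(rows, outcomes, steps=5000, rate=0.1):
    """Fit a logistic regression by plain gradient ascent.

    Returns the coefficients on the log-odds scale, intercept first.
    """
    beta = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, outcomes):
            features = [1.0] + list(x)
            linear = sum(b * f for b, f in zip(beta, features))
            p = 1.0 / (1.0 + math.exp(-linear))
            for j, f in enumerate(features):
                grad[j] += (y - p) * f
        beta = [b + rate * g / len(rows) for b, g in zip(beta, grad)]
    return beta


# Invented data: Abitur grades (lower is better), test scores, pass flags.
gpa = [1.0, 1.2, 1.3, 1.5, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0]
test = [48, 35, 52, 30, 45, 28, 40, 25, 38, 22, 33, 20]
passed = [1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0]

gpa_z = [-z for z in z_scores(gpa)]  # assumption: flipped so higher = better GPA
test_z = z_scores(test)

# Model 1: GPA only; Model 2: test only; Model 3: both plus their interaction.
model_1 = fit_logistic([[g] for g in gpa_z], passed)
model_2 = fit_logistic([[t] for t in test_z], passed)
model_3 = fit_logistic([[g, t, g * t] for g, t in zip(gpa_z, test_z)], passed)

for name, beta in [("Model 1", model_1), ("Model 2", model_2), ("Model 3", model_3)]:
    # exp(coefficient) is the odds ratio per standard deviation of the predictor.
    print(name, [round(math.exp(b), 2) for b in beta[1:]])
```

Because both predictors are standardized, each exponentiated coefficient reads as the change in odds per standard deviation, which is how the odds ratios in the Results are interpreted. With a sample this small the numbers themselves carry no meaning.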

### Results

#### Sample

67.3% of all participants successfully completed the first part of the medical state examination within the standard period of study (*Success4Sem*); by spring 2011 (*Success7Sem*) altogether 84.6% had succeeded. Table 1 [Tab. 1] gives an overview of the characteristics of the total sample and both sub-samples.

While GPA was significantly related to the overall score of the examination (r=0.24; p=0.004) and to the HAM-Nat (r=-0.24; p=0.038), it was not significantly correlated with the NatDenk (r=-0.11; p=0.324). The negative correlations show that a good (low) GPA is associated with good (high) test results.

#### Prediction

The results were evaluated in several steps. First, GPA and test results were examined separately in how far they are related to study success. Subsequently they were incorporated in one overall model which also included the interaction between both criteria.

###### Model 1: GPA

GPA was significantly related to study success in the total sample (odds ratios: for *Success7Sem*, OR=1.72; p=0.006, and for *Success4Sem*, OR=1.98; p<0.001). An OR of 1.72 signifies that the odds of having passed the examination after seven semesters increase 1.72-fold with every improvement in GPA by approximately 0.6 points (1 standard deviation).
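Since an odds ratio acts on odds rather than on probabilities, a short conversion may help in reading the figure above. The sketch below is illustrative only: the function name `apply_odds_ratio` is hypothetical, and using the overall pass rate of 84.6% as the starting probability for a student with average GPA is an assumption made for the example, not a model estimate.

```python
def apply_odds_ratio(probability, odds_ratio, sd_steps=1.0):
    """Shift a probability by an odds ratio applied over `sd_steps` standard deviations."""
    odds = probability / (1.0 - probability)
    new_odds = odds * odds_ratio ** sd_steps
    return new_odds / (1.0 + new_odds)


baseline = 0.846  # overall Success7Sem pass rate, used here as an assumed baseline
improved = apply_odds_ratio(baseline, 1.72)  # GPA better by one standard deviation
print(round(baseline, 3), "->", round(improved, 3))
```

Under this assumption the success probability rises from about 85% to roughly 90%, which shows that a 1.72-fold increase in odds is a much smaller change on the probability scale when the baseline is already high.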

If the division into separate test groups is considered, GPA has an effect on *Success7Sem* that is significant for the HAM-Nat group and lies just above the significance threshold of p=0.05 for the NatDenk group (see Table 2 [Tab. 2]). Figure 1 [Fig. 1] depicts this effect: GPA as independent variable was dichotomised at the median into good (GPA<1.6) vs. poorer (GPA≥1.6), and the estimated probability of study success is shown separately for both groups. For the criterion *Success4Sem*, GPA was significantly related to study success in both groups (see Table 2 [Tab. 2]).

###### Model 2: Test results (HAM-Nat vs. NatDenk)

The results of both tests are significantly related to study success if GPA is not considered (see Table 2 [Tab. 2]). For the HAM-Nat, the odds of passing the examination doubled per standard deviation of the test result (see Figure 1 [Fig. 1]). For the NatDenk, the odds increased slightly more, by a factor of 2.58. The two ORs did not differ significantly (p=0.909).

###### Model 3: GPA and test results (HAM-Nat vs. NatDenk) with their interaction term

If the effects of both admission criteria as well as their interaction term are estimated in one model, both main effects and their interaction are significant in the HAM-Nat group, whereas in the NatDenk group only the test result contributes significantly to the explained variance (see Table 2 [Tab. 2]). This means that the predictive efficiency of the HAM-Nat depends on the GPA of the tested individual. Especially in the case of a good GPA, the HAM-Nat provides additional information on study success (see Figure 1 [Fig. 1]). GPA, on the other hand, does not provide any further information beyond the NatDenk.
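What a significant interaction implies can be made concrete with the odds ratios reported in the abstract for the HAM-Nat group (test: OR=2.32; interaction: OR=1.80). The sketch assumes the usual model form with a product term of the two z-standardized predictors and a GPA z-score coded so that higher values mean a better GPA; neither detail is spelled out in the text, and the function name is hypothetical.

```python
def conditional_test_odds_ratio(or_test, or_interaction, gpa_z):
    """Odds ratio per SD of the test result at a given GPA z-score.

    Follows from logit(p) = b0 + b1*gpa + b2*test + b3*gpa*test:
    the effect of the test is exp(b2 + b3*gpa) = OR_test * OR_interaction**gpa.
    """
    return or_test * or_interaction ** gpa_z


OR_TEST, OR_INTERACTION = 2.32, 1.80  # HAM-Nat group values from the abstract

for gpa_z in (-1.0, 0.0, 1.0):
    effect = conditional_test_odds_ratio(OR_TEST, OR_INTERACTION, gpa_z)
    print(f"GPA z = {gpa_z:+.0f}: test OR per SD = {effect:.2f}")
```

Under these assumptions the test result matters most for students whose GPA is already good, which matches the statement that the HAM-Nat adds information especially in that group.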

For the criterion *Success4Sem* the interaction between GPA and HAM-Nat is not significant, and the GPA loses a little of its predictive power. In the NatDenk group, both predictors now contribute significantly to the prediction of study success; however, the influence of the GPA is greater than that of the test result.

#### Sensitivity and Specificity

The Receiver Operating Characteristic (ROC) shows the relation between sensitivity and specificity of a test (see Figure 2 [Fig. 2]). ROC-curves can help to illustrate possible effects for different selection quotas.

In our case, high sensitivity means rejecting as few applicants as possible who would study successfully. The highest sensitivity would be achieved by admitting all applicants, which corresponds to a selection quota of 100% (no selection; right side of the x-axis in Figure 2 [Fig. 2]). All students able to pass the examination would have been given a chance, and no one would have been rejected wrongly. However, everyone who does not pass the examination would also have been accepted (low specificity, also on the right side of the x-axis).

If our priority is to identify university dropouts, a specific test is required. Applying a stricter selection quota (moving further to the left on the x-axis) reduces sensitivity while specificity increases. With stricter selection, the accuracy of identifying those who will not pass the examination therefore rises, while at the same time numerous applicants who would have passed are rejected as well.
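The trade-off between sensitivity and specificity under different selection quotas can be traced with a small example. The data and the function name below are invented for illustration; "admitted" simply means belonging to the top fraction of applicants by score.

```python
def sensitivity_specificity(scores, passed, quota):
    """Admit the top `quota` fraction by score and evaluate the decision.

    Sensitivity: share of eventually successful students who were admitted.
    Specificity: share of eventually unsuccessful students who were rejected.
    """
    n_admit = round(quota * len(scores))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    admitted = ranked[:n_admit]
    true_pos = sum(1 for i in admitted if passed[i])
    false_pos = n_admit - true_pos
    n_pos = sum(passed)
    n_neg = len(passed) - n_pos
    return true_pos / n_pos, (n_neg - false_pos) / n_neg


# Invented example: ten applicants, higher score = better test result.
scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
passed = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0]

for quota in (1.0, 0.75, 0.5, 0.25):
    sens, spec = sensitivity_specificity(scores, passed, quota)
    print(f"quota {quota:.0%}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With a quota of 100% the sketch yields sensitivity 1 and specificity 0, as described above; tightening the quota moves the two values in opposite directions.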

A measure of a test’s quality is the “area under the curve” (AUC). It shows how well a test is able to separate two groups from each other (in our case ‘successful’ vs. ‘unsuccessful’). The AUCs for the different models and both tests are given in Table 3 [Tab. 3]. In both test groups the AUC for Model 1 (GPA alone) does not differ significantly from 0.5, which corresponds to random selection. In contrast, for the tests alone (Model 2) and for the combinations of the tests with GPA (Model 3) the AUCs differ significantly from 0.5.
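The AUC has a direct interpretation: it is the probability that a randomly chosen successful student has a higher score than a randomly chosen unsuccessful one, with ties counted as one half. The sketch below computes it that way on invented data; it is not the procedure used to produce Table 3, and the function name is hypothetical.

```python
def auc(scores, passed):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, passed) if y]
    neg = [s for s, y in zip(scores, passed) if not y]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented example: ten students, higher score = better test result.
scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
passed = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0]

print(round(auc(scores, passed), 3))
print(auc([1, 1, 1, 1], [1, 0, 1, 0]))  # uninformative score: all ties
```

A score that carries no information ends up at 0.5, the value against which the models in the text are tested.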

Figure 2 [Fig. 2] shows that particularly in case of a strict selection quota (high specificity, left on the x-axis) by HAM-Nat test results, we identify more successful students as compared to selection based on Abitur results only (Model 3 vs. Model 1). The NatDenk shows a higher sensitivity at lower specificity compared to the Abitur results. This situation stands for a selection process with many available study places and few applicants.

84.6% of the total sample had passed the first part of the medical state examination after 7 semesters; after 4 semesters 67.3% had succeeded. We extrapolated the influence of different selection quotas (no selection, ¾, ½ or ¼ of the sample) on *Success7Sem*. In both groups the pass rate was similarly high, namely 85.5% for HAM-Nat and 83.7% for NatDenk (see Table 4 [Tab. 4]). In the HAM-Nat group all students would have been successful if only the top 25% as judged by test result and GPA had been accepted. In the NatDenk group this would have produced a success rate of 90.5%. Here again the effect seen in the ROC curves becomes apparent: a strict selection quota would have been useful for the HAM-Nat, whereas a selection quota of 75% would have been most beneficial for the NatDenk.
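The extrapolation over selection quotas amounts to ranking students by a selection score and computing the pass rate among the top fraction. The following sketch shows the arithmetic on invented data; how test result and GPA were actually combined into one ranking is not described here, so the `composite` scores and the function name are purely hypothetical.

```python
def success_rate_at_quota(scores, passed, quota):
    """Pass rate among the top `quota` fraction of students ranked by score."""
    n_admit = max(1, round(quota * len(scores)))
    ranked = sorted(zip(scores, passed), key=lambda pair: pair[0], reverse=True)
    admitted = ranked[:n_admit]
    return sum(outcome for _, outcome in admitted) / n_admit


# Invented example: eight students with a hypothetical composite selection score.
composite = [2.1, 1.4, 0.9, 0.3, -0.2, -0.8, -1.3, -2.0]
passed = [1, 1, 1, 1, 1, 0, 1, 0]

for quota in (1.0, 0.75, 0.5, 0.25):
    rate = success_rate_at_quota(composite, passed, quota)
    print(f"quota {quota:.0%}: success rate {rate:.1%}")
```

In this toy sample the pass rate rises from 75% without selection to 100% once only the top half is admitted, mirroring the pattern reported for the HAM-Nat group at a much smaller scale.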

### Discussion

According to the Hochschul-Informations-System (HIS), the dropout rate among medical students is relatively low at 5% [8]. Nevertheless, because the number of applicants clearly exceeds the available places, the potentially best-suited applicants have to be selected. The selection of medical students and dropout rates are receiving increasing attention in debates on the shortage of medical practitioners. Since completing medical studies is a requirement for becoming a qualified doctor, one aim of the selection process is to identify applicants who will successfully complete university. Because dropping out of university is difficult to quantify, we operationalised the successful completion of studies as “first part of the medical state examination passed after 7 semesters”. In our case 15% of the sample did not meet this criterion. This operationalisation was chosen since only a very limited number of students pass after the 7th semester. Because this parameter is dichotomous, our results cannot be compared directly with the correlations cited in the introduction between test results and the metric study success parameter “grade in the first part of the medical state examination”. The reported odds ratios, however, allow the predictive value of the individual parameters to be compared.

Both the HAM-Nat and the NatDenk, taken on their own, have predictive power for the criterion *Success7Sem*. However, the two tests differ if we consider the selection practice regulated by the 'Hochschulrahmengesetz' (Framework Act for Higher Education in Germany), which requires test results to be combined with GPA. While the HAM-Nat offers additional predictive information particularly on the study success of students with a good GPA, the NatDenk allows for better differentiation within the group of applicants with a worse GPA. Under the current circumstances of medical school selection (many good applicants and a very limited number of available places), a selection process combining HAM-Nat and GPA (Model 3) has the highest predictive efficiency. In the group of the top 25% of participants in our study (combination of GPA and HAM-Nat), 100% passed the examination within 7 semesters. In combination with the NatDenk results, however, the success rate improved only from 83.7% without selection to 90.5%.

The model cannot be fully transferred to the actual selection process for several reasons: the small sample size, the participants' potential lack of motivation to make an effort, and the composition of the sample, which included students from groups whose admission (e.g. by waiting time or by excellent GPA alone) cannot be influenced by the university. To test the validity of the results, they will have to be replicated with a new cohort of students.

The aim of the study was to investigate whether the HAM-Nat shows similar correlations with study success as the test module NatDenk. No statements are made about the TMS in its entirety. The assumption that HAM-Nat and NatDenk relate to different constructs has been confirmed. We consider the relative independence of the HAM-Nat results from the GPA an advantage. We have thereby introduced a criterion which enables differentiation especially among top performers and has predictive power for study success even within this highly selective sample.

### Acknowledgement

The authors thank the dean of the medical school Prof. Dr. U. Koch-Gromus and Dr. B. Andresen for helpful suggestions and discussions as well as their support. We are much obliged to D. Münch-Harrach and C. Kothe for assisting us with the data handling. N. Feddersen was very helpful during the translation process. This study was supported by the “Förderfonds Lehre des Dekanates der Medizinischen Fakultät Hamburg”.

### References

1. Hampe W, Hissbach J, Kadmon M, Kadmon G, Klusmann D, Scheutzel P. Wer wird ein guter Arzt? Verfahren zur Auswahl von Studierenden der Human- und Zahnmedizin. Bundesgesundheitsbl Gesundheitsforsch Gesundheitsschutz. 2009;52(8):821-830. DOI: 10.1007/s00103-009-0905-6
2. Deutscher Bundestag. Hochschulrahmengesetz. BGBl. 2005;I:3835. Zugänglich unter/available from: http://www.bmbf.de/pub/HRG_20050126.pdf
3. Trost G, Flum F, Fay E, Klieme E, Maichle U, Meyer M, Nauels HU. Evaluation des Tests für Medizinische Studiengänge (TMS): Synopse der Ergebnisse. Bonn: ITB; 1998.
4. Trost G. Test für Medizinische Studiengänge (TMS): Studien zur Evaluation, 20. Arbeitsbericht: Institut für Test- und Begabungsforschung. Bonn: ITB; 1996.
5. Trapmann S, Hell B, Weigand S, Schuler H. Die Validität von Schulnoten zur Vorhersage des Studienerfolgs - eine Metaanalyse. Z Padagog Psychol. 2007;21(1):11-27. DOI: 10.1024/1010-0652.21.1.11
6. Hampe W, Klusmann D, Buhk H, Muench-Harrach D, Harendza S. Reduzierbarkeit der Abbrecherquote im Humanmedizinstudium durch das Hamburger Auswahlverfahren für Medizinische Studiengänge - Naturwissenschaftsteil (HAM-Nat). GMS Z Med Ausbild. 2008;25(2):Doc82. Zugänglich unter/available from: http://www.egms.de/static/de/journals/zma/2008-25/zma000566.shtml
7. Hissbach J, Klusmann D, Hampe W. Reliabilität des Hamburger Auswahlverfahrens für Medizinische Studiengänge, Naturwissenschaftsteil (HAM-Nat). GMS Z Med Ausbild. 2011;28(3):Doc44. DOI: 10.3205/zma000756
8. Heublein U, Schmelzer R, Sommer D, Wank J. Die Entwicklung der Schwund- und Studienabbruchquoten an den deutschen Hochschulen. Hannover: HIS Hochschul-Informations-System; 2008.