gms | German Medical Science

G-I-N Conference 2012

Guidelines International Network

22.08 - 25.08.2012, Berlin

Inter-rater reliability of AGREE II: need for refining the criteria?

Meeting Abstract

  • M. Goossens - Belgian Centre for Evidence Based Medicine vzw (CEBAM), Leuven, Belgium
  • G.E. Bekkering - Belgian Centre for Evidence Based Medicine vzw (CEBAM), Leuven, Belgium; Faculty of Psychology and Educational Sciences Methodology of Educational Scienc, Leuven, Belgium
  • K. Smets - Universiteit Antwerpen, Vakgroep Eerstelijns- en interdisciplinaire zorg, Antwerp, Belgium
  • B. Aertgeerts - Belgian Centre for Evidence Based Medicine vzw (CEBAM), Leuven, Belgium
  • M. Autrique - Vereniging voor Alcohol- en andere Drugproblemen vzw (VAD), Brussels, Belgium
  • P. Van Royen - Universiteit Antwerpen, Vakgroep Eerstelijns- en interdisciplinaire zorg, Antwerp, Belgium
  • K. Hannes - Faculty of Psychology and Educational Sciences Methodology of Educational Scienc, Leuven, Belgium

Guidelines International Network. G-I-N Conference 2012. Berlin, 22.-25.08.2012. Düsseldorf: German Medical Science GMS Publishing House; 2012. DocP030

DOI: 10.3205/12gin142, URN: urn:nbn:de:0183-12gin1424

Published: July 10, 2012

© 2012 Goossens et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc-nd/3.0/deed.en). You are free: to Share – to copy, distribute and transmit the work, provided the original author and source are credited.


Text

Background: In Belgium, evidence-based practice guidelines in the field of adolescent substance misuse are currently being developed using the ADAPTE methodology. Guideline appraisal with the AGREE II instrument is one of its important steps.

Context: Research preceding the launch of AGREE II confirmed that the inter-rater reliability of the instrument is 'adequate' to 'satisfactory'. However, we experienced considerable differences in ratings. This matters for the subsequent ADAPTE process, as the methodology subscale of the instrument may be used to reduce a large number of retrieved guidelines.

Description of best practice: 37 relevant guidelines were identified. Two reviewers independently appraised a first set of 13 guidelines using the AGREE II instrument according to the manual, without prior pilots or discussions. Comparison of the scores showed remarkable differences; the internal consistency (Cronbach's alpha) ranged from 0.49 to 0.68. Large differences were present, for example, in the methodology domain. Sum scores differed by up to 22 points between the two reviewers (on a scale from 1 to 56, i.e. the maximum score for the methodology domain). After consultation, it became clear that the reviewers' approaches differed: one applied AGREE II in a more pragmatic way; the other applied the instrument very strictly and strove for perfection in guideline development.
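As an illustration only, the kind of score comparison described above could be sketched as follows. This is a minimal sketch, assuming that the two reviewers are treated as the "items" in the standard Cronbach's alpha formula and that each guideline contributes one domain sum score per reviewer; the abstract does not state how alpha was actually computed. The function names and the example scores are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha with reviewers as 'items' (an assumption of this sketch).

    ratings: list of rows, one row per guideline,
             each row holding one score per reviewer.
    """
    n_reviewers = len(ratings[0])
    if n_reviewers < 2 or len(ratings) < 2:
        raise ValueError("need at least two reviewers and two guidelines")

    # Sample variance of each reviewer's scores across guidelines.
    per_reviewer = list(zip(*ratings))
    sum_of_variances = sum(variance(scores) for scores in per_reviewer)

    # Sample variance of the per-guideline totals across reviewers.
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    k = n_reviewers
    return (k / (k - 1)) * (1 - sum_of_variances / total_variance)


def max_score_gap(ratings):
    """Largest absolute difference between two reviewers on any guideline."""
    return max(abs(a - b) for a, b in ratings)


# Hypothetical methodology-domain sum scores (reviewer A, reviewer B).
example_scores = [(30, 45), (22, 40), (41, 50), (18, 25), (35, 38)]

print(f"alpha = {cronbach_alpha(example_scores):.2f}")
print(f"largest gap = {max_score_gap(example_scores)} points")
```

Running such a check on a small pilot set before the full appraisal is one way to make a diverging rating approach visible early.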

Lessons for guideline adaptors: Our results suggest that more detailed instructions are needed to improve the reliability of AGREE II. We suggest adding clearer criteria (the considerations are often too noncommittal) and having the reviewers carry out a pilot appraisal of a different guideline beforehand in order to improve inter-rater reliability.