Performance of the automated abstract screening tool Rayyan for scoping reviews: evidence from five reviews
Published: March 27, 2025
Background/research question: In a survey of professionals conducting systematic reviews, 89% of respondents reported using automation tools, primarily Covidence, RevMan and Rayyan [1]. These tools have been shown to prioritize records efficiently during abstract screening, with 95% of relevant records found within the top-ranked 50% of records [2], [3]. Although designed predominantly to support systematic reviews, these tools are increasingly applied to scoping reviews. Scoping reviews differ from systematic reviews by focusing on broader research questions to map evidence rather than critically appraising it. Although scoping reviews are a frequently used evidence synthesis method, studies on the efficiency of automation tools in scoping reviews are scarce. This study evaluates Rayyan’s accuracy and efficiency in identifying relevant records for scoping reviews.
Methods: This study drew on data from five scoping reviews conducted between 2022 and 2024, covering diverse topics in digital health. For each review, a subset of 2,500 records was created that maintained the review’s original imbalance between included and excluded records; one review with a smaller dataset was the exception. Rayyan was trained on a random sample of at least 50 records with at least five inclusions. The decisions of two independent human reviewers served as the gold standard. At each 10% increment of screening progress, ratings were computed, records were ranked, and inclusion/exclusion predictions were recorded. Rayyan’s performance and prioritization efficiency were calculated using Stata V.16.
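As an illustration of how a subset can maintain a review's imbalance, the following minimal Python sketch draws a fixed-size sample with the same share of included records as the full dataset. It is not the authors' procedure: the function name `stratified_subset`, the `included` field, and the proportional-rounding rule are assumptions made for this example.

```python
import random


def stratified_subset(records, size, seed=0):
    """Draw `size` records while keeping the included/excluded ratio roughly intact.

    `records` is a list of dicts with a boolean "included" key (hypothetical layout).
    Datasets no larger than `size` are returned unchanged.
    """
    if len(records) <= size:
        return list(records)

    rng = random.Random(seed)
    included = [r for r in records if r["included"]]
    excluded = [r for r in records if not r["included"]]

    # Number of included records that keeps the original inclusion rate.
    n_included = round(size * len(included) / len(records))
    n_excluded = size - n_included

    subset = rng.sample(included, n_included) + rng.sample(excluded, n_excluded)
    rng.shuffle(subset)  # avoid grouping all inclusions at the front
    return subset
```

For example, a dataset of 10,000 records with 400 inclusions (4%) would yield a 2,500-record subset containing 100 inclusions.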
Results: Although Rayyan made no inclusion predictions, it prioritized records efficiently, identifying 95% of relevant records within the first 50% screened across all scoping reviews. After 20% of records had been screened, Rayyan predicted the exclusion of 65%, with additional predictions made in later stages. Initially challenged by borderline cases, Rayyan’s predictive accuracy improved throughout the screening process, and the ranking of individual records became increasingly stable.
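The prioritization result (share of relevant records found after a given share of records has been screened) can be sketched as a recall curve over a ranked list. The Python snippet below is only an illustration under stated assumptions: the study's calculations were done in Stata V.16, and the function name `recall_at_steps` and the 0/1 label encoding are hypothetical.

```python
def recall_at_steps(ranked_labels, steps=10):
    """Return recall after each screening step as {percent_screened: recall}.

    `ranked_labels` lists the gold-standard decisions (1 = relevant, 0 = not)
    in the order the tool prioritized the records, highest priority first.
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        raise ValueError("at least one relevant record is required")

    n = len(ranked_labels)
    curve = {}
    for i in range(1, steps + 1):
        cutoff = (n * i) // steps             # records screened so far
        found = sum(ranked_labels[:cutoff])   # relevant records among them
        curve[i * 100 // steps] = found / total_relevant
    return curve
```

With the default of ten steps, `recall_at_steps(labels)[50]` gives the share of relevant records found within the first 50% screened, which is the quantity reported as 95% above.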
Conclusion: Our evaluation of the automation tool Rayyan for record screening in scoping reviews yielded promising results, demonstrating performance on par with that in systematic reviews. However, further research is needed to understand the factors influencing the performance of automation tools in scoping reviews, determine the ideal stopping criteria to maximize time savings, and explore the potential of large language models in the review process.
Competing interests: N/A
References
1. Scott AM, Forbes C, Clark J, Carter M, Glasziou P, Munn Z. Systematic review automation tools improve efficiency but lack of knowledge impedes their adoption: a survey. J Clin Epidemiol. 2021 Oct;138:80-94. DOI: 10.1016/j.jclinepi.2021.06.030
2. Chai KEK, Lines RLJ, Gucciardi DF, Ng L. Research Screener: a machine learning tool to semi-automate abstract screening for systematic reviews. Syst Rev. 2021 Apr 1;10(1):93. DOI: 10.1186/s13643-021-01635-3
3. Hamel C, Hersi M, Kelly SE, Tricco AC, Straus S, Wells G, Pham B, Hutton B. Guidance for using artificial intelligence for title and abstract screening while conducting knowledge syntheses. BMC Med Res Methodol. 2021 Dec 20;21(1):285. DOI: 10.1186/s12874-021-01451-2