gms | German Medical Science

Information Retrieval Meeting (IRM 2022)

10 - 11 June 2022, Cologne

Evaluation of a new approach to automation in the evidence surveillance workflow

Meeting Abstract

  • James Thomas (corresponding author, presenter) - University College London, United Kingdom
  • Ian Shemilt - University College London, United Kingdom

Information Retrieval Meeting (IRM 2022). Cologne, 10.-11.06.2022. Düsseldorf: German Medical Science GMS Publishing House; 2022. Doc22irm25

doi: 10.3205/22irm25, urn:nbn:de:0183-22irm253

Published: 8 June 2022

© 2022 Thomas et al.
This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information see http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: The pandemic has highlighted the importance of evidence surveillance and of keeping up to date with a rapidly evolving evidence base, but it has also shown how difficult this can be when huge numbers of new publications are scattered across numerous repositories.

One challenge to effective evidence surveillance has traditionally been the effort involved in running frequent searches across multiple databases, deduplicating, and screening the results. The recent introduction of large, comprehensive open access bibliographic databases (such as Microsoft Academic and Semantic Scholar) has created the potential to search a single, regularly updated source instead.

Methods and materials: An evidence surveillance system was built in the systematic review software EPPI-Reviewer using Microsoft Academic as a source dataset. Every two weeks new records arrive in the system and are automatically associated with potentially relevant reviews using machine learning.

The system was evaluated by addressing two research questions:

1. Does the source dataset contain all the records required?
2. Can relevant records be identified efficiently?

Two evaluations were conducted. In the first evaluation, research question 1 was addressed by comparing the yield of conventional sources (Medline / Embase) with that of Microsoft Academic in a systematic living map of COVID-19 research. Research question 2 was addressed by comparing the workload involved in identifying records from conventional sources with that of a machine learning alternative operating on Microsoft Academic.
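As an illustration of the yield comparison for research question 1, the recall of a source can be expressed as the share of all known relevant records that the source contains. The following is a minimal sketch under stated assumptions, not the evaluation code used in the study: the function name `source_recall` and the record identifiers are hypothetical, and the toy data are not results.

```python
def source_recall(source_ids, relevant_ids):
    """Return the fraction of relevant records that a source contains."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    return len(relevant & set(source_ids)) / len(relevant)


# Hypothetical toy identifiers, invented for illustration only.
relevant = {"r1", "r2", "r3", "r4"}
conventional = {"r1", "r2", "x9"}
single_source = {"r1", "r2", "r3", "x7", "x8"}

print(source_recall(conventional, relevant))   # 0.5
print(source_recall(single_source, relevant))  # 0.75
```

In practice the set of relevant records would be the records judged relevant across all sources combined, so that each source is measured against the same reference set.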

The second evaluation simulated the performance of the system across 515 Cochrane systematic reviews of effectiveness. Research question 1 was addressed by matching the bibliographic records of included studies against Microsoft Academic and examining the characteristics of the missing records. Research question 2 was addressed through a large-scale simulation of running the automation system across all reviews and examining the workload necessary to maintain a recall above 99%.
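The workload measure in research question 2 can be sketched as the number of records that must be screened, in descending order of classifier score, before a target recall is reached. This is a minimal sketch under stated assumptions rather than the simulation used in the study: the function name `screening_workload`, the scores, and the labels are hypothetical, and it assumes the relevance labels are known in advance, as they are in a retrospective simulation.

```python
def screening_workload(scores, labels, target_recall=0.99):
    """Count the top-ranked records to screen to reach the target recall.

    scores: classifier scores, higher meaning more likely relevant.
    labels: 1 for a relevant record, 0 otherwise, in the same order.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_relevant = sum(labels)
    if total_relevant == 0:
        return 0  # nothing to find, so no screening is needed

    # Rank records from the highest to the lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    found = 0
    for n_screened, (_, label) in enumerate(ranked, start=1):
        found += label
        if found / total_relevant >= target_recall:
            return n_screened
    return len(ranked)


# Hypothetical toy data, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.1]
labels = [1, 0, 1, 0, 1, 0]

print(screening_workload(scores, labels))                     # 5
print(screening_workload(scores, labels, target_recall=0.6))  # 3
```

With only three relevant records, a 99% target requires all three to be found, so screening continues to the fifth-ranked record. Lowering the target to 60% stops at the third.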

Results: Evaluation 1 found that the Microsoft Academic workflow was more cost-effective than the conventional workflow and resulted in considerably higher recall (i.e. the Microsoft Academic dataset contained more relevant records than the conventional sources). Evaluation 2 found that over 99% of relevant journal articles were present in Microsoft Academic, but that trials registry records were absent, along with some conference abstracts and dissertations. The simulation of the automated workflow suggests that over 80% of reviews can be maintained with a screening burden of fewer than 60 records per week.

Conclusions: Evidence surveillance using machine learning and a single comprehensive source appears viable, though further evaluation in other domains is needed, and some specific specialist sources (e.g. trials registries) are likely to be required.

Keywords: systematic review, evidence surveillance, machine learning, automation