gms | German Medical Science

Information Retrieval Meeting (IRM 2022)

10.06. - 11.06.2022, Köln

Lessons learned by NICE using an AI platform to replicate literature surveillance tasks

Meeting Abstract

  • Patrick Langford - The National Institute for Health and Care Excellence (NICE), United Kingdom
  • Tiffany Chow - Genesis Research, United States of America
  • Michael Raynor - The National Institute for Health and Care Excellence (NICE), United Kingdom
  • Daniel Tuvey - The National Institute for Health and Care Excellence (NICE), United Kingdom
  • Niamh Knapton - The National Institute for Health and Care Excellence (NICE), United Kingdom
  • Monica Casey - The National Institute for Health and Care Excellence (NICE), United Kingdom
  • presenting/speaker Chris L. Pashos - Genesis Research, United States of America
  • corresponding author Matthew Michelson - Genesis Research, United States of America
  • Kay Nolan - The National Institute for Health and Care Excellence (NICE), United Kingdom

Information Retrieval Meeting (IRM 2022). Cologne, 10.-11.06.2022. Düsseldorf: German Medical Science GMS Publishing House; 2022. Doc22irm24

doi: 10.3205/22irm24, urn:nbn:de:0183-22irm245

Published: June 8, 2022

© 2022 Langford et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: A challenge in evidence surveillance is the rapid pace of publication and the extensive volume of literature. Yet the literature must be reviewed to determine its relevance across a number of topics. At NICE, for instance, the guidance currently contains over 25,000 evidence-based recommendations. We therefore investigated whether Artificial Intelligence (AI) can reduce some of the manual effort.

Methods: We used an AI platform [1] that lets users refine PubMed results with AI-generated filters in a PICO (Population, Intervention, Comparator, Outcome) format. The filters are based on automatically extracted results that associate patient groups or interventions with measured outcomes. For example, one can filter to “survival” results for patients treated with “x”. Applying this AI-assisted approach, we retrospectively analyzed reproducibility, time savings, and discovery of additional relevant articles for 4 NICE decision problems/review questions. We tested combinations of search strategies and AI-generated filters to determine whether we could reproduce the “gold standard” from previous human efforts.
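The reproducibility check described above can be sketched as a set comparison. This is a minimal illustration, not the platform's or the authors' implementation: the function name `evaluate_strategy`, the strategy labels, and the record identifiers are all hypothetical, and it assumes each strategy's output and the gold standard are available as sets of record identifiers.

```python
from typing import Dict, Set


def evaluate_strategy(retrieved: Set[str], gold: Set[str]) -> Dict[str, object]:
    """Compare the records one strategy retrieved with the gold standard."""
    found = retrieved & gold
    return {
        # Share of gold-standard records that the strategy retrieved.
        "recall": len(found) / len(gold) if gold else 1.0,
        # True only if every gold-standard record was retrieved.
        "reproduces_gold": gold <= retrieved,
        # Gold-standard records the strategy failed to surface.
        "missed": sorted(gold - retrieved),
        # Records outside the gold standard; these still need human review
        # before they can be called relevant.
        "extra_candidates": sorted(retrieved - gold),
    }


# Hypothetical identifiers, not real PMIDs or study data.
gold_standard = {"id1", "id2", "id3"}
strategies = {
    "keyword + AI filter": {"id1", "id2", "id3", "id9"},
    "advanced string only": {"id1", "id3"},
}

for name, retrieved in strategies.items():
    print(name, evaluate_strategy(retrieved, gold_standard))
```

Under these assumptions, a strategy reproduces the gold standard when `reproduces_gold` is true, and `extra_candidates` lists the records that a reviewer would still have to assess as potential new finds.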

Results/Discussion: We compared AI-generated search and sift results to “gold standard” results that were available in the EVID platform for the 4 NICE decision problems. This effort focused solely on papers reporting primary clinical outcomes (e.g., not reviews) that were indexed in PubMed. In every case, we reproduced the gold standard using at least one search strategy combined with AI filters. In cases where all the papers had been identified, the average best time saving was 84% (Table 1). For one decision problem, NG50 RQ2, the AI surfaced 3 relevant articles that had not previously been found manually.
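The abstract does not state how the "average best" time saving was computed. One plausible reading, shown here purely as an assumption, is: for each decision problem, take the largest relative time saving among the strategies that reproduced the gold standard, then average those maxima. The function names, the dictionary layout, and all numbers below are hypothetical and are not study data.

```python
from statistics import mean
from typing import Dict, List


def time_saving(manual_minutes: float, ai_minutes: float) -> float:
    """Relative time saved by the AI-assisted run versus the manual run."""
    return (manual_minutes - ai_minutes) / manual_minutes


def average_best_saving(problems: Dict[str, List[dict]]) -> float:
    """Average, over decision problems, of the best saving per problem.

    Only runs that reproduced the gold standard are counted (assumption).
    """
    best_per_problem = []
    for runs in problems.values():
        savings = [
            time_saving(run["manual"], run["ai"])
            for run in runs
            if run["reproduced"]
        ]
        if savings:
            best_per_problem.append(max(savings))
    if not best_per_problem:
        raise ValueError("no run reproduced the gold standard")
    return mean(best_per_problem)


# Hypothetical timings in minutes.
example = {
    "problem A": [
        {"manual": 100, "ai": 20, "reproduced": True},
        {"manual": 100, "ai": 10, "reproduced": False},  # excluded
    ],
    "problem B": [
        {"manual": 200, "ai": 80, "reproduced": True},
    ],
}

print(f"{average_best_saving(example):.0%}")
```

Excluding runs that missed gold-standard papers matters in this reading: a faster run that fails to reproduce the gold standard would otherwise inflate the reported saving.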

However, challenges also arose. In each case, several AI search strategies failed to surface all relevant articles. This was usually attributed to users not applying the appropriate AI filters for the searches, which suggests that more user training is necessary for such technology. Another insight was that simple keyword searches, in some cases followed by AI filters, seemed more efficient than advanced search strings. To be most inclusive, we combined results from several advanced search strings with AI-filtered keyword searches. Finally, the AI performed better with larger datasets: it replicated human results more often with larger volumes of data and showed greater time savings.
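Combining results from several strategies amounts to taking the union of their result sets. The sketch below assumes results are sets of record identifiers; the function name, strategy labels, and identifiers are hypothetical. It illustrates how a combined pool can cover the gold standard even when no single strategy does, at the cost of a larger pool to sift.

```python
from typing import Dict, Set


def combine_strategies(results: Dict[str, Set[str]]) -> Set[str]:
    """Return the de-duplicated union of all strategies' results."""
    combined: Set[str] = set()
    for ids in results.values():
        combined |= ids
    return combined


# Hypothetical identifiers, not real PMIDs or study data.
gold_standard = {"id1", "id2", "id3", "id4"}
results = {
    "advanced string A": {"id1", "id2", "id7"},
    "advanced string B": {"id2", "id3"},
    "keyword + AI filter": {"id3", "id4", "id8"},
}

pool = combine_strategies(results)
covers_gold = gold_standard <= pool
single_strategy_covers = any(gold_standard <= ids for ids in results.values())

print(len(pool), covers_gold, single_strategy_covers)
```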

Conclusions: Using AI for literature surveillance, we successfully validated retrospective results with possible time savings and, in one case, potentially found a new article of interest. However, using AI requires search strategies that may be less intuitive than traditional literature searching, and iterative searching can negate some of the time benefits because it returns larger pools of results. Therefore, efforts to develop awareness and standardization of AI attributes and processes should continue.

Keywords: artificial intelligence, literature surveillance, NICE


References

1. Michelson M, Chow T, Martin N, Ross M, Tee Qiao Ying A, Minton S. Artificial Intelligence for Rapid Meta-Analysis: Case Study on Ocular Toxicity of Hydroxychloroquine. J Med Internet Res. 2020;22(8):e20007.