gms | German Medical Science

Information Retrieval Meeting (IRM 2022)

10.06. - 11.06.2022, Köln

Papyrus for literature review: visual scoping of gonorrhoea infection

Meeting Abstract

  • Nicolas Médoc (corresponding author, presenting speaker) - Luxembourg Institute of Science and Technology, IT for Innovative Services, Esch-sur-Alzette, Luxembourg
  • Jane Whelan - GSK, Clinical and Epidemiology Research and Development, Amsterdam, The Netherlands
  • Ekkehard Beck - GSK, Value Evidence, Wavre, Belgium
  • Mohammad Ghoniem - Luxembourg Institute of Science and Technology, IT for Innovative Services, Esch-sur-Alzette, Luxembourg

Information Retrieval Meeting (IRM 2022). Cologne, 10.-11.06.2022. Düsseldorf: German Medical Science GMS Publishing House; 2022. Doc22irm18

doi: 10.3205/22irm18, urn:nbn:de:0183-22irm184

Published: 8 June 2022

© 2022 Médoc et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. For license information, see http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: The scientific literature is growing exponentially, which increases the time and resources needed to conduct a systematic literature review (SLR). In health science, researchers use frameworks such as Patient, Intervention, Comparison, Outcome (PICO) to specify the research question and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) to report the output of their analyses. A review frequently starts from a compilation of around ten thousand references. Reading every abstract individually to identify the aetiology, risk factors, natural history, related health outcomes or treatments is extremely resource-intensive. We therefore propose Papyrus [1], a visual analytics software tool that supports the exploration of large text collections [2].

Methods: Papyrus was designed to 1) produce an overview of textual content and related topics; 2) visually represent closely related aspects to generate or refine hypotheses; and 3) drill down into topics to focus on specific aspects and gather evidence for hypothesis validation. Because textual data is inherently unstructured, natural language processing (NLP) and text mining algorithms are required to extract meaningful patterns and help the analyst retrieve information relevant to the research question.

The NLP pre-processing step of Papyrus extracts keywords from all abstracts and annotates health science concepts with Medical Subject Headings (MeSH). A vector space model is then constructed, and a topic modelling algorithm is applied to partition the vocabulary and the document set into topics.
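
As an illustration of the vector space step only, the following minimal sketch builds TF-IDF vectors from keyword lists and compares documents by cosine similarity. The abstract does not state which weighting scheme or topic model Papyrus uses, so TF-IDF, the function names, and the toy keyword lists are all assumptions made for this example; the topic modelling step is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (each doc is a keyword list)."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Relative term frequency times inverse document frequency.
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical keyword lists standing in for pre-processed abstracts.
docs = [
    ["gonorrhoea", "infertility", "pelvic", "inflammatory", "disease"],
    ["gonorrhoea", "pelvic", "inflammatory", "disease", "ectopic", "pregnancy"],
    ["gonorrhoea", "antimicrobial", "resistance", "ceftriaxone"],
]
vectors = tfidf_vectors(docs)
```

In this toy collection, a keyword that appears in every document (here "gonorrhoea") receives a weight of zero, so similarity is driven by the more specific terms that only some documents share.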

Papyrus supports exploratory tasks through the visualizations presented in Figure 1. First, a map of topics gives an overview of the textual content as a mosaic of word clouds. Next, the analyst can scrutinize a topic through an ordered list of its keywords and access the related documents to read them on demand. Finally, the topic detail view supports the exploration of specific aspects of a topic by browsing associations of keywords that occur together in two or more documents.
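
The idea of keyword associations occurring together in two or more documents can be sketched as a simple pair count. This is not the Papyrus implementation, which the abstract does not describe at this level; the function name `cooccurring_pairs`, the `min_docs` parameter, and the toy keyword lists are hypothetical, and only pairs of keywords are considered here.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(docs, min_docs=2):
    """Return keyword pairs that appear together in at least `min_docs` documents."""
    pair_counts = Counter()
    for keywords in docs:
        # Sort and deduplicate so each unordered pair is counted once per document.
        for pair in combinations(sorted(set(keywords)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_docs}


# Hypothetical keyword lists standing in for pre-processed abstracts.
docs = [
    ["gonorrhoea", "infertility", "pelvic inflammatory disease"],
    ["gonorrhoea", "pelvic inflammatory disease", "ectopic pregnancy"],
    ["gonorrhoea", "antimicrobial resistance"],
]
pairs = cooccurring_pairs(docs)
```

With these toy lists, the only pair retained is ("gonorrhoea", "pelvic inflammatory disease"), which occurs in two documents; every other pair occurs in a single document and is filtered out.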

Results: Papyrus was recently used to conduct a scoping review of gonorrhoea infection [3]. Papyrus helped uncover 124 health problems, including serious but rare outcomes, in much less time than other review techniques: SLR and high-yield search identified 99 and 53 health problems, respectively.

Conclusions: Papyrus is a visual text analytics software tool that shows promising results in identifying health outcomes from several thousand scientific articles. In the future, we plan to fully support the SLR workflow by implementing state-of-the-art frameworks such as PICO and PRISMA. We also aim to analyze how many more rare outcomes can be found with Papyrus, which may become more important given antimicrobial resistance.

Keywords: visual text mining, natural language processing, topic modelling, systematic literature review, gonorrhoea


References

1. Médoc N, Ghoniem M. Papyrus [Internet]. Luxembourg Institute of Science and Technology. Available from: https://papyrus.list.lu
2. Médoc N, Ghoniem M, Nadif M. Visual Exploration of Topic Variants Through a Hybrid Biclustering Approach. In: Actes de la 28ième conférence francophone sur l’IHM [Internet]. Fribourg, Switzerland; 2016. p. 103-14. DOI: 10.1145/3004107.3004116
3. Whelan J, Ghoniem M, Médoc N, Apicella M, Beck E. Applying a novel approach to scoping review incorporating artificial intelligence: mapping the natural history of gonorrhoea. BMC Med Res Methodol. 2021 Sep 6;21(1):183. DOI: 10.1186/s12874-021-01367-x