gms | German Medical Science

68. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)

17.09. - 21.09.23, Heilbronn

Semi-automated title-abstract screening using natural language processing and machine learning

Meeting Abstract

  • Johannes Vey - Institute of Medical Biometry, University of Heidelberg, Heidelberg, Germany
  • Samuel Zimmermann - Institute of Medical Biometry, University of Heidelberg, Heidelberg, Germany
  • Maximilian Pilz - Fraunhofer Institute for Industrial Mathematics, Kaiserslautern, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 68. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS). Heilbronn, 17.-21.09.2023. Düsseldorf: German Medical Science GMS Publishing House; 2023. DocAbstr. 92

doi: 10.3205/23gmds121, urn:nbn:de:0183-23gmds1216

Published: September 15, 2023

© 2023 Vey et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Introduction: Systematic reviews synthesize all available evidence on a specific research question. A key task is the literature search, which should be as comprehensive as possible to identify all relevant studies and reduce the risk of reporting bias. The identified studies then need to be screened against predefined inclusion criteria. Because the search is deliberately broad, this screening is time-consuming, resource-intensive and tedious for all researchers involved. In the first stage of this process, the title-abstract (TIAB) screening, the titles and abstracts of all initially identified studies are screened and classified for inclusion in or exclusion from full-text screening. Conventionally, this is done by two independent human reviewers. In recent years, some research has addressed automating the literature search and screening processes [1], [2].

We present a semi-automated framework for TIAB screening using natural language processing (NLP) and machine learning (ML) that was also submitted to the CEN2023 Conference (ID 281).

Methods: The approach was developed and applied within a systematic review project on the reduction of surgical site infection incidence in elective colorectal resections, following Friedrichs et al. [3].

The literature search yielded a total of 4460 citations that were suitable for TIAB screening. The titles and abstracts of the publications were processed with NLP methods to transform the plain text into a numerical data representation. The data set was split into a training set and a test set, and the training set was grown iteratively to assess the sample size required for accurate classification. Based on the training data, variable selection was performed using Elastic Net regularized regression. Subsequently, different ML algorithms (Elastic Net, Support Vector Machine, Random Forest, and Light Gradient Boosting Machine) were trained using 5-fold cross-validation and a grid search over the respective tuning parameters. For model comparison, AUC values were calculated on the test set, with the decision of the two human reviewers serving as the reference.
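The abstract does not state which numerical representation was used for the titles and abstracts. As an illustration only, the following minimal sketch assumes a TF-IDF bag-of-words representation, one common choice for this step; the function names and the toy records are hypothetical and not taken from the review.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_matrix(documents):
    """Return (vocabulary, rows), where each row is one document's TF-IDF vector."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocabulary}
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        rows.append([counts[t] / total * idf[t] for t in vocabulary])
    return vocabulary, rows


# Hypothetical toy records standing in for title-abstract texts.
records = [
    "Antibiotic prophylaxis reduces surgical site infection",
    "Surgical site infection after colorectal resection",
    "Machine learning for image segmentation",
]
vocab, X = tfidf_matrix(records)
```

Each row of `X` can then serve as the feature vector of one citation for variable selection and classifier training. Terms that occur in fewer records (such as "antibiotic") receive a higher weight than terms shared across records (such as "surgical").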

Results: The Random Forest showed the highest performance in the test set (AUC: 96%). Choosing a cut-off that avoided missing any relevant abstract (n=136; FN rate: 0%) resulted in 755 false positives (FP rate: 26.5%), while 2089 abstracts were correctly classified for exclusion. The investigation of the minimum number of abstracts required indicated that at least 1000 abstracts are needed to train the Random Forest reliably.
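The cut-off choice and the reported false-positive rate can be sketched as follows. This is a minimal illustration, assuming that an abstract is flagged for inclusion when its predicted score is at least the cut-off and that the cut-off is set to the lowest score among the relevant abstracts; the abstract does not describe the exact procedure, and the scores and labels below are hypothetical.

```python
def zero_miss_cutoff(scores, labels):
    """Largest cut-off that still flags every relevant record (label 1)."""
    return min(s for s, y in zip(scores, labels) if y == 1)


def confusion_counts(scores, labels, cutoff):
    """Count TP, FP, TN, FN when records with score >= cutoff are flagged."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        flagged = s >= cutoff
        if flagged and y == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


# Hypothetical predicted scores and reference labels (1 = relevant).
scores = [0.91, 0.15, 0.40, 0.72, 0.05, 0.38, 0.55]
labels = [1, 0, 1, 0, 0, 0, 1]
cutoff = zero_miss_cutoff(scores, labels)
tp, fp, tn, fn = confusion_counts(scores, labels, cutoff)

# Counts reported in the abstract: FP rate = FP / (FP + TN).
reported_fp, reported_tn = 755, 2089
reported_fp_rate = reported_fp / (reported_fp + reported_tn)
print(f"FP rate: {reported_fp_rate:.1%}")  # FP rate: 26.5%
```

With the reported counts, 755 / (755 + 2089) reproduces the stated false-positive rate of 26.5%.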

Discussion: We propose an approach in which an ML model can replace one human reviewer after being trained on a sufficient number of abstracts. The second reviewer only needs to get involved in cases of discrepancy between the decision of the first reviewer and the classification model. In our case study, the manual TIAB screening workload for the second reviewer could be reduced by about 70%. In general, the workload reduction depends on the accuracy of the ML model and the total number of citations identified for TIAB screening.
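The abstract does not give the formula behind the figure of about 70%. One reading that is consistent with the reported counts, assumed here, is that abstracts correctly excluded by the model no longer need a second human assessment, so the reduction is the share of true negatives among all screened test-set abstracts. The function name is hypothetical.

```python
def second_reviewer_workload_reduction(tp, fp, tn, fn=0):
    """Share of abstracts the second reviewer no longer has to screen.

    Assumption: only abstracts the model correctly excludes (true negatives)
    are removed from the second reviewer's workload.
    """
    total = tp + fp + tn + fn
    return tn / total


# Counts reported in the abstract: 136 relevant, 755 false positives,
# 2089 correct exclusions, no false negatives.
reduction = second_reviewer_workload_reduction(tp=136, fp=755, tn=2089)
print(f"Workload reduction: {reduction:.1%}")  # Workload reduction: 70.1%
```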

Conclusion: We demonstrated that the developed NLP and ML pipeline can partly automate TIAB screening. In our systematic review, the TIAB screening burden could be markedly reduced.

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.

Prior submission: The abstract has been submitted for oral presentation to the CEN2023 Conference (contribution ID 281; Vey, Johannes A.; Zimmermann, Samuel; Pilz, Maximilian. Text classification to automate abstract screening using machine learning) but has not yet been published anywhere.


References

1. O'Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev. 2015;4(1):5. DOI: 10.1186/2046-4053-4-5
2. Lange T, Schwarzer G, Datzmann T, Binder H. Machine learning for identifying relevant publications in updates of systematic reviews of diagnostic test studies. Res Syn Meth. 2021;12(4):506–515. DOI: 10.1002/jrsm.1486
3. Friedrichs J, Seide S, Vey J, Zimmermann S, Hardt J, Kleeff J, Klose J, Michalski CW, Kieser M, Pilz M, Ronellenfitsch U. Interventions to reduce the incidence of surgical site infection in colorectal resections: systematic review with multicomponent network meta-analysis (INTRISSI): study protocol. BMJ Open. 2021;11(11):e057226. DOI: 10.1136/bmjopen-2021-057226