Automation of duplicate detection for systematic reviews
Published: 8 June 2022
Text
Introduction/Background: Systematic reviews (SRs) are considered the best way to answer a research question. However, they are resource intensive, requiring on average five staff members and 67 weeks to complete, at an average cost of USD 141,000. To reduce this burden, systematic review automation (SRA) tools have been developed to speed up SR tasks without compromising quality. One time-consuming task is removing duplicate records from search results, which can take even experienced searchers hours to complete. We have designed an SRA tool, "the Deduplicator", with the goal of greatly speeding up this process.
Methods: To evaluate the Deduplicator, we will compare deduplication done manually with deduplication done using the Deduplicator on the following outcomes: 1) time required to deduplicate; 2) number of duplicates missed; 3) number of non-duplicates removed. Two screeners will independently deduplicate 10 sets of search results. The first screener will do sets 1 to 5 manually, then sets 6 to 10 with the Deduplicator. The second screener will do the opposite, i.e., sets 1 to 5 with the Deduplicator, then sets 6 to 10 manually. If these results are promising, the evaluation will be expanded to a stronger study design, including additional sets of search results and more participants.
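As an illustration of outcomes 2 and 3, the following is a minimal sketch of how the two error counts could be computed for one set of search results, assuming a gold-standard list of true duplicates is available. The function name, the record IDs, and the toy data are hypothetical and are not taken from the study.

```python
def deduplication_errors(true_duplicates, removed):
    """Count the two error types named in the Methods for one result set.

    true_duplicates: IDs of records that really are duplicates (gold standard)
    removed: IDs of records the screener or tool removed as duplicates
    """
    true_duplicates = set(true_duplicates)
    removed = set(removed)
    missed = true_duplicates - removed           # duplicates left in the set
    wrongly_removed = removed - true_duplicates  # non-duplicates removed
    return {
        "duplicates_missed": len(missed),
        "non_duplicates_removed": len(wrongly_removed),
    }


# Toy data, invented for illustration only
gold = {"r2", "r5", "r9"}
removed = {"r2", "r5", "r7"}
result = deduplication_errors(gold, removed)
print(result)
```

In this toy case one duplicate ("r9") is missed and one non-duplicate ("r7") is removed. The same counts could be computed per screener and per method to compare manual and tool-assisted deduplication.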
Results: The Deduplicator has been tested internally on a test set of search results from published SRs, comprising 9835 references in total. This testing showed a combined accuracy of 99.04% (9741 of 9835 references correctly classified). There was also a substantial time saving: the time an experienced person needed for duplicate removal fell from one hour to 10 minutes.
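The reported figures can be checked with simple arithmetic. The sketch below uses only the numbers stated in the Results; the variable names are illustrative, and the relative time reduction is derived here rather than reported by the authors.

```python
total_references = 9835
correctly_classified = 9741

# Accuracy as the share of references classified correctly
accuracy = correctly_classified / total_references
misclassified = total_references - correctly_classified

# Relative time reduction, from one hour (60 minutes) to 10 minutes
manual_minutes = 60
tool_minutes = 10
time_reduction = 1 - tool_minutes / manual_minutes

print(f"Accuracy: {accuracy:.2%}")
print(f"Misclassified references: {misclassified}")
print(f"Time reduction: {time_reduction:.1%}")
```

This reproduces the stated 99.04% accuracy, which corresponds to 94 misclassified references, and shows that the reported time saving amounts to a reduction of roughly five sixths.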
Conclusion: Early testing shows the Deduplicator increases the speed of duplicate detection, with no loss of quality. More robust results will be presented at the research conference.
Keywords: systematic reviews, automation, deduplication