Artur Nowak, Ewelina Sadowska, Ewa Borowiack
Evidence Prime, Krakow, Poland
SuperDeduper is a Laser AI module that performs AI-assisted deduplication of references.
SuperDeduper was the only general-purpose deduplication tool to achieve 100% specificity (0 lost references).
It also achieved 98.05% accuracy, which is second only to SRA Deduplicator (98.53%).
Our results indicate that the SuperDeduper module is a reliable and efficient tool for removing duplicates without requiring human supervision.
The publicly available dataset has limitations. Therefore, as part of the planned prospective validation, we will create a more comprehensive dataset, which we intend to publish.
Removing duplicate records is a crucial step in conducting systematic reviews, including reviews of cost and cost-effectiveness outcomes. It is often a long and laborious process, but properly performed deduplication prevents researchers from reviewing the same references retrieved from different databases, which reduces the time spent on screening. At the same time, errors in deduplication pose a risk of losing relevant studies before screening even starts.
This problem is addressed by some of the search engines, which allow deduplication of the results at the search stage (e.g., Ovid). However, this approach has limitations since it cannot be applied to all databases.
Deduplication tools can be categorized according to their level of automation and the artificial intelligence (AI) support that guides them. They range from manual comparison of key fields in reference management software, such as EndNote, to review management tools in which deduplication is supported by AI models (e.g., Deduklick). One of the tools that use AI models to remove duplicate references is SuperDeduper, which combines the classic approach of comparing fields with delegating checks of close matches to AI (semantic similarity). SuperDeduper was built to provide 100% specificity (no references erroneously classified as duplicates) while maximizing accuracy and sensitivity (to minimize the unnecessary manual work of reviewing duplicates).
Our aim was to retrospectively evaluate the accuracy of SuperDeduper, using publicly available benchmark datasets that compared multiple existing tools.
We looked for gold-standard datasets that had been used to evaluate automatic deduplication in at least one reference management or literature review software package. To identify existing datasets that could be used to validate SuperDeduper, we conducted a rapid review of validation studies for other deduplication tools. Our search yielded seven relevant articles [1-7]. Of these, three provided access to the datasets used in their validation processes [1, 5, 7]. We used the dataset described in [1, 7], which has been used to validate nearly all available deduplication tools. This choice allows us to compare the performance of SuperDeduper directly with that of other existing tools, providing a robust benchmark for evaluating its effectiveness. The benchmark set of deduplicated references was created manually in an Excel sheet based on 3,130 records identified through searches in MEDLINE, Embase, PsycINFO, and the Cochrane Central Register of Controlled Trials (Cochrane CENTRAL).
We ran SuperDeduper on the same dataset to compare the number of false positives (number of unique references erroneously marked as duplicates) and false negatives (duplicate references that were missed by the method).
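The comparison described above can be sketched in a few lines of Python. This is only an illustration of how false positives and false negatives are counted against a gold standard; the function name `count_errors` and the record IDs are hypothetical and do not come from the benchmark dataset or from SuperDeduper itself.

```python
def count_errors(gold_duplicates, flagged_duplicates):
    """Compare a tool's output with a gold standard.

    Both arguments are collections of record IDs labelled as duplicates
    (records that should be, or were, removed).
    """
    gold = set(gold_duplicates)
    flagged = set(flagged_duplicates)
    return {
        "true_positives": len(gold & flagged),
        # Unique references erroneously marked as duplicates
        "false_positives": len(flagged - gold),
        # Duplicate references the method missed
        "false_negatives": len(gold - flagged),
    }


# Toy example with made-up record IDs (hypothetical, for illustration only)
gold = {"r2", "r5", "r9"}
flagged = {"r2", "r5", "r7"}
result = count_errors(gold, flagged)
print(result)
```

In this toy case, `r7` is a unique reference that would be lost (a false positive), and `r9` is a duplicate that would remain for manual removal (a false negative).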
SuperDeduper is designed to resemble the traditional approach of applying sets of rules for matching bibliographic metadata [8,9], but uses AI to review the potential duplicate pairs (the work that is typically done by humans). This hybrid approach (Fig. 2) has multiple advantages. First, it provides more transparency because it automates the existing best practices instead of replacing them with a black box. Second, it allows customization of the deduplication algorithm, especially if users have a tried-and-tested approach for the set of databases in which they search. Third, it splits the potential duplicate pairs into multiple confidence bands, allowing users to decide which tasks to delegate to AI and which to check manually.
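To make the idea of rule-based matching with confidence bands more concrete, here is a minimal sketch. It is not SuperDeduper's implementation: the field names, the thresholds (0.95 and 0.80), the band labels, and the use of Python's `difflib` string similarity in place of an AI model are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def title_similarity(a, b):
    """Character-level similarity of two normalized titles, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a["title"]), normalize(b["title"])).ratio()


def band(a, b, high=0.95, low=0.80):
    """Assign a candidate pair to a confidence band (hypothetical rules)."""
    doi_a, doi_b = a.get("doi"), b.get("doi")
    # Rule 1: identical DOIs are treated as a confident match.
    if doi_a and doi_b and doi_a.lower() == doi_b.lower():
        return "duplicate"
    score = title_similarity(a, b)
    # Rule 2: near-identical titles published in the same year.
    if score >= high and a.get("year") == b.get("year"):
        return "duplicate"
    # Close matches go to a reviewer (AI or human) instead of being removed.
    if score >= low:
        return "needs_review"
    return "unique"


ref_a = {"title": "Aspirin for primary prevention: a randomized trial.", "year": 2018}
ref_b = {"title": "Aspirin for Primary Prevention: A Randomized Trial", "year": 2018}
ref_c = {"title": "Deduplication of references in systematic reviews", "year": 2018}
ref_d = {"title": "Deduplication of references in systematic reviews update", "year": 2018}
ref_e = {"title": "Statins in older adults", "year": 2018}

print(band(ref_a, ref_b), band(ref_c, ref_d), band(ref_a, ref_e))
```

Keeping the middle band separate is what lets users decide which pairs to delegate and which to check manually, and the thresholds are the natural place to customize the rules for a particular set of databases.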
Figure 1 shows the user interface of the Laser AI tool, which allows users to review and correct SuperDeduper’s decisions.
We compared the results of SuperDeduper with the following 13 tools/methods evaluated on the McKeown dataset [1, 7]: Ovid Multifile search; the review software Covidence, Rayyan, Deduklick, and Systematic Review Accelerator’s Deduplicator; and the reference management software EndNote desktop X9, Manual EndNote online classic, EndNote 20, Mendeley Desktop, Zotero, and ProQuest RefWorks (similar and exact match). On the same sample of references, SuperDeduper identified 1,177 duplicates. There were no false positives (compared with a range of 0-208 false positives for the other tools) and 61 false negatives (compared with a range of 35-718 false negatives for the other tools) [Table 1].
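The reported counts can be turned into the usual performance metrics with a short calculation. The sketch below assumes that the benchmark contains 3,130 records in total and that, with zero false positives, all 1,177 identified duplicates are true positives; the true-negative count and the sensitivity are derived here under those assumptions rather than quoted from the results.

```python
total_records = 3130     # size of the benchmark set (assumed denominator)
flagged = 1177           # duplicates identified by SuperDeduper
false_positives = 0      # unique references erroneously marked as duplicates
false_negatives = 61     # duplicates that were missed

true_positives = flagged - false_positives
true_negatives = total_records - true_positives - false_positives - false_negatives

sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)
accuracy = (true_positives + true_negatives) / total_records

print(f"specificity: {specificity:.2%}")
print(f"sensitivity: {sensitivity:.2%}")
print(f"accuracy:    {accuracy:.2%}")
```

Under these assumptions, the calculation reproduces the reported 100% specificity and 98.05% accuracy, and gives a sensitivity of about 95%.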
Our results suggest that the SuperDeduper module is a time-saving and effective method to remove duplicates while minimizing human supervision.
The limitation of this experiment is that we used only one dataset to validate the efficacy. However, most identified studies did not make their datasets publicly available or used proprietary data. These findings underscore the need for accessible, standardized datasets to support robust validation efforts in the deduplication space, and they highlight the potential value of sharing datasets for future research and development.
Using the chosen dataset allows for direct comparison with a large number of tools that support the systematic review process. However, this dataset has some limitations, as noted in [1, 7]. It was created from searches conducted on a single platform, Ovid, and dates back to 2018, before databases such as Embase and Cochrane CENTRAL began including unpublished sources such as clinical trial registries.
In the next step, we will prospectively validate SuperDeduper using more comprehensive datasets. These datasets will cover a larger number of databases and platforms (at least seven per dataset) and will span various clinical areas and research topics, so that the validation reflects diverse scientific domains. This diversity in sources and topics will provide a robust foundation for assessing SuperDeduper's effectiveness across different contexts and data sources. We plan to make the deduplication datasets publicly available.
Meet our team:
Co-founder and the CTO of Evidence Prime. He helps the brightest minds answer the most challenging questions in healthcare through his work in the area of artificial intelligence, especially in the context of systematic review automation. Meet Artur at ISPOR Europe 2023.
Evidence Synthesis Specialist with 15 years of experience in conducting HTA reports, systematic reviews, and targeted literature reviews. At Evidence Prime, she provides methodological knowledge to the designers and the software development team.
Evidence Synthesis Specialist at Evidence Prime. She is responsible for testing new solutions in Laser AI and conducting evidence synthesis research.