Loading Citation Data
Begin by loading your citation data from an EndNote XML file using the load_search() function. You can specify alternative file formats, such as CSV, if needed.
citations <- load_search("systematic_search.xml", method="endnote")
Automated Deduplication
Remove duplicate citations automatically using the dedup_citations() function.
results <- dedup_citations(citations, merge_citations = TRUE)
#> formatting data...
#> identifying potential duplicates...
#> identified duplicates!
#> flagging potential pairs for manual dedup...
#> Joining with `by = join_by(duplicate_id.x, duplicate_id.y)`
#> 8948 citations loaded...
#> 3508 duplicate citations removed...
#> 5440 unique citations remaining!
The dedup_citations() function returns a list of two dataframes by default. The first contains the unique citations that remain after ASySD has removed duplicates automatically. In most cases, this step removes the vast majority of duplicates. The second contains potential duplicate pairs that ASySD flagged but did not remove; these need manual review by a human (see the next step).
unique_citations <- results$unique
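Before moving on, it can help to check what the returned list contains. The sketch below uses a small stand-in list rather than real ASySD output: the element names unique and manual_dedup match the ones used in this walkthrough, but the toy columns and values are assumptions for illustration only.

```r
# Toy stand-in for the list returned by dedup_citations().
# The real dataframes contain many more columns; these are illustrative.
results <- list(
  unique = data.frame(
    duplicate_id = c("1", "2", "3"),
    title = c("Study A", "Study B", "Study C"),
    stringsAsFactors = FALSE
  ),
  manual_dedup = data.frame(
    duplicate_id.x = "2",
    duplicate_id.y = "3",
    stringsAsFactors = FALSE
  )
)

# List the elements, then count rows in each dataframe
names(results)
n_unique <- nrow(results$unique)
n_pairs <- nrow(results$manual_dedup)

message(n_unique, " unique citations; ",
        n_pairs, " potential pairs flagged for manual review")
```

With real output, the same nrow() calls give a quick indication of how much manual review is left.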
Manual Deduplication
To address remaining duplicates, review potential pairs manually. You can examine them within R or export them to a CSV or Excel file for detailed inspection.
potential_duplicates <- results$manual_dedup
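If you prefer to review the pairs in a spreadsheet, one option is to write them to CSV with base R. This is a minimal sketch using a toy dataframe. The columns title1 and title2 and the file name are hypothetical; only duplicate_id.x and duplicate_id.y are taken from the join message shown earlier.

```r
# Toy stand-in for results$manual_dedup (columns are illustrative)
potential_duplicates <- data.frame(
  duplicate_id.x = c("1", "4"),
  duplicate_id.y = c("2", "5"),
  title1 = c("Sleep and memory", "Stroke outcomes"),
  title2 = c("Sleep and memory.", "Stroke outcome measures"),
  stringsAsFactors = FALSE
)

# Write to a temporary location; replace with a path of your choice
review_file <- file.path(tempdir(), "potential_duplicates.csv")
write.csv(potential_duplicates, review_file, row.names = FALSE)
```

Setting row.names = FALSE avoids an extra index column that would otherwise need to be removed when the file is read back in.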
After reviewing the pairs, create a dataframe containing only the true duplicate pairs. Here, for simplicity, I have kept the first 16 rows of the potential_duplicates dataframe. If you exported the pairs to a file, you could instead delete the rows that are not duplicates and re-import the remaining rows.
true_duplicates <- potential_duplicates[1:16,]
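If the review was done in an exported file, the reviewed pairs can be read back in and filtered. The sketch below assumes the reviewer added a logical column named is_duplicate; that column, the toy data, and the file name are all hypothetical and not part of ASySD.

```r
# Simulate a reviewed file: a reviewer has marked each pair TRUE or FALSE
reviewed <- data.frame(
  duplicate_id.x = c("1", "4", "7"),
  duplicate_id.y = c("2", "5", "8"),
  is_duplicate = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
reviewed_file <- file.path(tempdir(), "reviewed_pairs.csv")
write.csv(reviewed, reviewed_file, row.names = FALSE)

# Read the reviewed pairs back in, keeping the IDs as character strings
reviewed_pairs <- read.csv(
  reviewed_file,
  colClasses = c(duplicate_id.x = "character", duplicate_id.y = "character")
)

# Keep only the confirmed duplicates and drop the helper column
true_duplicates <- reviewed_pairs[reviewed_pairs$is_duplicate, , drop = FALSE]
true_duplicates$is_duplicate <- NULL
```

Reading the ID columns as character avoids them being converted to numbers, which could cause mismatches if the original IDs are stored as text.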
Final Deduplication
Combine the results of automated and manual deduplication using the dedup_citations_add_manual() function. Pass any additional duplicates you have identified in the additional_pairs argument.
final_results <- dedup_citations_add_manual(unique_citations, additional_pairs = true_duplicates)
#> Joining with `by = join_by(record_id)`
Exporting Results
You can now write your results to a file for import into a reference manager or systematic review software.
write_citations(final_results, type="txt", filename="citations.txt")