Utilities / Remove duplicate reads from BAM


Given a BAM file, this tool removes reads which have identical external coordinates.



You have to indicate whether your data is single-end or paired-end. In paired-end mode, the insert size (ISIZE) has to be set correctly in the BAM file. If multiple read pairs have identical coordinates, only the pair with the highest mapping quality is kept. Note that this tool does not work for orphan reads or for pairs whose reads map to different chromosomes.
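
The selection rule for paired-end data can be illustrated with a minimal Python sketch. It is not the SAMtools implementation and does not read BAM files. The ReadPair record, its fields and the remove_duplicate_pairs function are hypothetical names introduced only for this example, and the sketch assumes that "identical coordinates" means the same chromosome and the same outer start and end positions.

```python
from collections import namedtuple

# Hypothetical, simplified record for a read pair (not the BAM data model).
ReadPair = namedtuple("ReadPair", ["name", "chrom", "start", "end", "mapq"])


def remove_duplicate_pairs(pairs):
    """Keep one pair per (chrom, start, end): the one with the highest mapq."""
    best = {}
    for pair in pairs:
        key = (pair.chrom, pair.start, pair.end)
        # Replace the stored pair only if this one has strictly higher quality.
        if key not in best or pair.mapq > best[key].mapq:
            best[key] = pair
    return list(best.values())


pairs = [
    ReadPair("p1", "chr1", 100, 350, 30),  # duplicate of p2, lower quality
    ReadPair("p2", "chr1", 100, 350, 60),  # kept: highest quality at these coordinates
    ReadPair("p3", "chr1", 100, 400, 20),  # kept: different end coordinate
]
kept = remove_duplicate_pairs(pairs)
print([p.name for p in kept])  # prints ['p2', 'p3']
```

In this sketch, when two pairs tie on mapping quality the first one encountered is kept. How the actual tool breaks such ties is not stated here.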


Output is a BAM file.


This tool is based on the SAMtools package. Please cite the article "The Sequence alignment/map (SAM) format and SAMtools" by Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R. and 1000 Genome Project Data Processing Subgroup (2009) Bioinformatics, 25, 2078-9. [PMID: 19505943].