Using Mate-Pairs to Improve Genome Assembly
Author Information
Author(s): Joshua Wetzel, Carl Kingsford, Mihai Pop
Primary Institution: Princeton University
Hypothesis
How useful are mate-pairs for resolving repeats in de novo assemblies built from short reads?
Conclusion
Dramatic improvements in prokaryotic genome assembly quality can be achieved by tuning mate-pair sizes to the actual repeat structure of a genome.
Supporting Evidence
- Short mate-pairs disambiguate repeat regions more effectively than the library sizes that are commonly constructed.
- The best assemblies can be obtained by tuning mate-pair libraries to accommodate the specific repeat structure of the genome.
- Results were consistent across 360 simulations and the assembly of 8 bacterial genomes.
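
The idea of tuning a library to a genome's repeat structure can be illustrated with a toy calculation. The sketch below is not the authors' method. It assumes a simplified criterion in which a mate-pair can bridge a repeat only if both reads fit entirely in the unique sequence flanking it, so the insert size minus two read lengths must be at least the repeat length. The function names, the candidate insert sizes, and the repeat lengths are all hypothetical, and breaking ties in favor of the shorter library is a simplifying assumption.

```python
def spans_repeat(insert_size, read_len, repeat_len):
    """Toy criterion (assumption): the gap between the two reads covers the repeat."""
    return insert_size - 2 * read_len >= repeat_len


def resolvable_fraction(insert_size, read_len, repeat_lens):
    """Fraction of repeats that a library of this insert size could bridge."""
    if not repeat_lens:
        return 1.0
    bridged = sum(spans_repeat(insert_size, read_len, r) for r in repeat_lens)
    return bridged / len(repeat_lens)


def best_insert_size(candidates, read_len, repeat_lens):
    """Pick the candidate bridging the most repeats; ties go to the shorter library."""
    return max(
        candidates,
        key=lambda size: (resolvable_fraction(size, read_len, repeat_lens), -size),
    )


if __name__ == "__main__":
    # Hypothetical repeat lengths (bp), e.g. estimated from an unpaired assembly.
    repeats = [300, 1200, 5000]
    for size in (500, 2000, 8000, 10000):
        print(size, resolvable_fraction(size, 100, repeats))
    print("chosen:", best_insert_size([500, 2000, 8000, 10000], 100, repeats))
```

A real analysis would work on the assembly graph rather than a flat list of repeat lengths, so this only conveys the intuition that the useful library size depends on the repeats actually present in the genome.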
Takeaway
This study shows that using shorter mate-pairs that match the repeat structure of a genome can help scientists put together the pieces of DNA more accurately.
Methodology
The study simulated idealized sequencing projects and compared how different mate-pair library sizes affect genome assembly.
Limitations
The results are limited to prokaryotic genomes and may not extend to eukaryotic genomes with different repeat structures.