Assessing the benefits of using mate-pairs to resolve repeats in de novo short-read prokaryotic assemblies
2011

Using Mate-Pairs to Improve Genome Assembly

Sample size: 360 simulations
Evidence: high

Author Information

Author(s): Joshua Wetzel, Carl Kingsford, Mihai Pop

Primary Institution: Princeton University

Hypothesis

How useful are mate-pairs for resolving repeats in de novo assemblies created from short-reads?

Conclusion

Dramatic improvements in prokaryotic genome assembly quality can be achieved by tuning mate-pair sizes to the actual repeat structure of a genome.

Supporting Evidence

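  • In high-coverage prokaryotic assemblies, libraries of short mate-pairs (about 4 to 6 times the read length) disambiguate repeat regions more effectively than the libraries commonly constructed in genome projects.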
  • The best assemblies can be obtained by tuning mate-pair libraries to accommodate the specific repeat structure of the genome.
  • These results held across 360 simulations on idealized prokaryotic data and the assembly of 8 bacterial genomes with SOAPdenovo.
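
To make the idea of tuning a library to a genome's repeat structure concrete, here is a minimal Python sketch. It is a toy illustration, not the procedure used in the study. It assumes a mate-pair can bridge a repeat only when both reads sit entirely in the unique sequence flanking it, so the insert must be at least the repeat length plus two read lengths. The function names, repeat lengths, and candidate insert sizes are all hypothetical. The preference for the shortest insert that reaches a target follows the summary's emphasis on short libraries.

```python
def spans_repeat(insert_size, repeat_len, read_len):
    """Assumed rule: both reads must lie fully outside the repeat."""
    return insert_size >= repeat_len + 2 * read_len


def spanned_fraction(insert_size, repeat_lengths, read_len):
    """Fraction of repeats that a library of this insert size can bridge."""
    if not repeat_lengths:
        return 1.0  # nothing to bridge
    spanned = sum(spans_repeat(insert_size, r, read_len) for r in repeat_lengths)
    return spanned / len(repeat_lengths)


def smallest_sufficient_insert(candidates, repeat_lengths, read_len, target=0.9):
    """Return the shortest candidate insert size that bridges at least
    `target` of the repeats, or None if no candidate does."""
    for size in sorted(candidates):
        if spanned_fraction(size, repeat_lengths, read_len) >= target:
            return size
    return None


# Made-up example data (not taken from the study).
repeat_lengths = [120, 150, 300, 310, 1200, 5000]  # bp
candidates = [200, 400, 600, 2000, 8000]           # candidate insert sizes, bp
read_len = 50                                      # bp

print(spanned_fraction(400, repeat_lengths, read_len))                        # 0.5
print(smallest_sufficient_insert(candidates, repeat_lengths, read_len, 0.8))  # 2000
```

In this toy data most repeats are short, so a 400 bp library already bridges half of them, and only the rare long repeats call for a much larger insert. A real analysis would derive the repeat lengths from the assembly graph of the genome in question.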

Takeaway

This study shows that using shorter mate-pairs that match the repeat structure of a genome can help scientists put together the pieces of DNA more accurately.

Methodology

The study involved simulations of ideal sequencing projects and comparisons of different mate-pair library sizes on genome assembly.

Limitations

The results are limited to prokaryotic genomes and may not extend to eukaryotic genomes with different repeat structures.

Digital Object Identifier (DOI)

10.1186/1471-2105-12-95
