Parallelized short read assembly of large genomes using de Bruijn graphs
2011

PASHA: A Fast Assembler for Large Genomes

Sample size: 4 | Publication | 10 minutes | Evidence: high

Author Information

Author(s): Liu Yongchao, Schmidt Bertil, Maskell Douglas L

Primary Institution: Nanyang Technological University, Singapore

Hypothesis

Can a parallelized short read assembler using de Bruijn graphs improve the efficiency and scalability of genome assembly?

Conclusion

PASHA can assemble large genomes with high quality and in reasonable time using modest compute resources.

Supporting Evidence

  • PASHA produces higher-quality assemblies than Velvet, ABySS, and SOAPdenovo.
  • PASHA completed the human genome assembly in about 21 hours using modest compute resources.
  • PASHA is about 2.25 times faster on average than ABySS.

Takeaway

PASHA is a computer program that helps scientists assemble DNA sequences quickly and accurately, even for large genomes such as the human genome.

Methodology

PASHA uses hybrid parallelism with shared-memory multi-core CPUs and distributed-memory compute clusters to assemble genomes.
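
To make the underlying idea concrete, here is a minimal sketch of building a de Bruijn graph from short reads and splitting the k-mers across workers. It is an illustration only, not PASHA's implementation: the function names, the CRC32-based ownership rule, and the toy reads are all assumptions made for this example.

```python
import zlib
from collections import defaultdict


def kmers(read, k):
    """Yield every substring of length k in a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Each k-mer adds an edge from its (k-1)-length prefix to its
    (k-1)-length suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def partition_kmers(reads, k, num_workers):
    """Assign each distinct k-mer to exactly one worker.

    Hypothetical scheme: a deterministic CRC32 hash picks the owner, so
    every process agrees on ownership without communicating. This only
    illustrates the distributed-memory idea.
    """
    buckets = [set() for _ in range(num_workers)]
    for read in reads:
        for kmer in kmers(read, k):
            owner = zlib.crc32(kmer.encode("ascii")) % num_workers
            buckets[owner].add(kmer)
    return buckets


# Toy input: two overlapping reads (made up for illustration).
reads = ["ACGTC", "CGTCA"]
graph = build_de_bruijn(reads, k=3)
buckets = partition_kmers(reads, k=3, num_workers=2)
```

In a hybrid setting, each bucket could live on a separate cluster node, with threads on that node processing its k-mers in parallel. The sketch leaves out steps a real assembler needs, such as reverse complements, error removal, and contig generation.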

Limitations

The study does not address the performance of PASHA on genomes larger than the human genome.

Digital Object Identifier (DOI)

10.1186/1471-2105-12-354
