PASHA: A Fast Assembler for Large Genomes
Author Information
Author(s): Liu Yongchao, Schmidt Bertil, Maskell Douglas L
Primary Institution: Nanyang Technological University, Singapore
Hypothesis
Can a parallelized short read assembler using de Bruijn graphs improve the efficiency and scalability of genome assembly?
Conclusion
PASHA can assemble large genomes with high quality and in reasonable time using modest compute resources.
Supporting Evidence
- PASHA produces higher-quality assemblies than Velvet, ABySS, and SOAPdenovo.
- PASHA completed the human genome assembly in about 21 hours using modest compute resources.
- PASHA is about 2.25 times faster on average than ABySS.
Takeaway
PASHA is a computer program that helps scientists put together DNA sequences quickly and accurately, even for big genomes like humans.
Methodology
PASHA uses hybrid parallelism with shared-memory multi-core CPUs and distributed-memory compute clusters to assemble genomes.
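As a rough illustration of the idea behind this methodology, the sketch below builds a small de Bruijn graph from short reads and splits its nodes across partitions by hash, the way a distributed-memory assembler might spread graph nodes across cluster processes. It is not PASHA's implementation: this summary does not describe PASHA's data structures, and the function names (`kmers`, `owner`, `build_partitioned_graph`, `merge`), the CRC32 partitioning rule, and the toy reads are all assumptions made for the example. It runs in a single process, and it ignores reverse complements and sequencing errors, which a real assembler has to handle.

```python
import zlib
from collections import defaultdict


def kmers(read, k):
    """Yield every overlapping k-mer in a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def owner(node, num_partitions):
    """Map a graph node to a partition (hypothetical rule: CRC32 modulo)."""
    # zlib.crc32 is stable across runs, unlike Python's built-in str hash.
    return zlib.crc32(node.encode("ascii")) % num_partitions


def build_partitioned_graph(reads, k, num_partitions):
    """Build a de Bruijn graph whose nodes are (k-1)-mers.

    Each k-mer contributes one edge from its (k-1)-prefix to its
    (k-1)-suffix. The edge is stored in the partition that owns the
    prefix node, so every node lives in exactly one partition.
    """
    partitions = [defaultdict(set) for _ in range(num_partitions)]
    for read in reads:
        for kmer in kmers(read, k):
            prefix, suffix = kmer[:-1], kmer[1:]
            partitions[owner(prefix, num_partitions)][prefix].add(suffix)
    return partitions


def merge(partitions):
    """Combine per-partition adjacency lists into one graph."""
    graph = {}
    for part in partitions:
        for node, successors in part.items():
            # Partitions own disjoint node sets, so there is nothing to reconcile.
            graph[node] = set(successors)
    return graph


# Toy reads that overlap, chosen only for illustration.
reads = ["ACGTAC", "GTACGG"]
graph_single = merge(build_partitioned_graph(reads, k=3, num_partitions=1))
graph_split = merge(build_partitioned_graph(reads, k=3, num_partitions=4))
```

Because each node is owned by exactly one partition, the merged graph does not depend on how many partitions are used; only the placement of the work changes. In a hybrid setup, each partition would sit on a separate cluster process, with threads inside that process sharing its portion of the graph.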
Limitations
The study does not evaluate PASHA on genomes larger than the human genome.
Digital Object Identifier (DOI)