Short-read Assembly Using Seeds
Author Information
Author(s): Hossain Mohammad Sajjad, Azimi Navid, Skiena Steven
Primary Institution: Department of Computer Science, Stony Brook University
Hypothesis
Can a new assembler effectively produce de novo sequence assemblies from short-read data generated by the ABI SOLiD sequencing technology?
Conclusion
The SHORTY assembler produces effective assemblies of bacterial genomes from simulated SOLiD data and outperforms competing assemblers, although its performance on real data is more limited.
Supporting Evidence
- SHORTY can produce substantial assemblies from as few as 5-10 seeds.
- N50 contig sizes of around 40 kb were achieved for bacterial genomes.
- The method allows high-coverage assembly with modest seed requirements.
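
For readers unfamiliar with the N50 statistic cited above, the following is a minimal sketch of the standard definition: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembled length. The function name `n50` and the sample lengths are illustrative assumptions, not taken from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")

    # Walk the contigs from longest to shortest, accumulating bases.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (in bases), for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_lengths)
```

In this toy example the total is 290 bases, so the running sum first reaches the halfway mark of 145 at the second-longest contig. An N50 of around 40 kb therefore means that at least half of the assembled bases lie in contigs of roughly 40 kb or longer.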
Takeaway
This study shows that we can use a few starting pieces of DNA to help put together a bigger picture of a genome, even when the pieces are really small.
Methodology
The SHORTY assembler augments short-read data with a small number of seeds to perform de novo assembly, and it is designed for paired-end microread sequencing.
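
To give a rough feel for what growing an assembly from a seed can look like, here is a deliberately simplified sketch of greedy seed extension by suffix-prefix overlap. It is not SHORTY's actual algorithm: it ignores paired-end information, sequencing errors, color-space encoding, and repeats. The function `extend_seed`, the `min_overlap` parameter, and the toy reads are all hypothetical.

```python
def extend_seed(seed, reads, min_overlap=4):
    """Greedily extend a seed to the right using overlapping reads.

    Illustrative assumption only: a read is appended when a prefix of it
    (at least `min_overlap` bases) matches the current contig suffix.
    """
    contig = seed
    unused = list(reads)
    extended = True
    while extended:
        extended = False
        for read in unused:
            # Try the longest possible overlap first; the read must add
            # at least one new base, hence len(read) - 1.
            max_k = min(len(contig), len(read) - 1)
            for k in range(max_k, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    contig += read[k:]
                    unused.remove(read)
                    extended = True
                    break
            if extended:
                # Restart the scan, since the contig suffix has changed.
                break
    return contig


# Toy data, for illustration only.
toy_contig = extend_seed("ACGTAC", ["CGGTTA", "GTACGG"], min_overlap=3)
```

Here the seed `ACGTAC` is first extended by `GTACGG` (overlap `GTAC`) and then by `CGGTTA` (overlap `CGG`). A real assembler has to resolve ambiguous extensions, which is where paired-end constraints become important.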
Limitations
Performance on real data is limited by sequencing artifacts, which degrade assembly quality.