Crystallizing short-read assemblies around seeds
2009

Short-read Assembly Using Seeds

Publication Evidence: moderate

Author Information

Author(s): Hossain Mohammad Sajjad, Azimi Navid, Skiena Steven

Primary Institution: Department of Computer Science, Stony Brook University

Hypothesis

Can a new assembler effectively produce de novo sequence assemblies from short-read data generated by the ABI SOLiD sequencing technology?

Conclusion

On simulated SOLiD data, the SHORTY assembler produces effective assemblies of bacterial genomes and outperforms competing assemblers; its performance on real data is more limited.

Supporting Evidence

  • SHORTY can produce substantial assemblies from as few as 5-10 seeds.
  • N50 contig sizes of around 40 kb were achieved for bacterial genomes.
  • The method supports high-coverage assembly with only modest seed requirements.
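
For readers unfamiliar with the N50 statistic cited above, the following is a minimal sketch of how it is conventionally computed: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the total assembled bases. The function name `n50` and the contig lengths are illustrative assumptions, not values from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    # Sort from longest to shortest contig.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the total assembled bases defines the N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths in bases (not data from the paper).
example_lengths = [80_000, 40_000, 30_000, 20_000, 10_000]
example_n50 = n50(example_lengths)
```

An N50 of around 40 kb therefore means that at least half of the assembled bases lie in contigs of roughly 40 kb or longer.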

Takeaway

This study shows that we can use a few starting pieces of DNA to help put together a bigger picture of a genome, even when the pieces are really small.

Methodology

The SHORTY assembler uses a small number of seeds to augment short-read data for de novo assembly, focusing on paired-end microread sequencing.
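
The general idea of growing an assembly outward from a seed can be illustrated with a toy greedy extension. This sketch is not SHORTY's algorithm: it ignores paired-end information, SOLiD color-space encoding, and sequencing errors, and the function name `extend_seed`, the `min_overlap` parameter, and the sequences are all hypothetical.

```python
def extend_seed(seed, reads, min_overlap=4):
    """Greedily extend `seed` to the right with reads that overlap its end.

    Toy illustration only: exact-match overlaps, single-end reads, no errors.
    """
    contig = seed
    unused = list(reads)
    extended = True
    while extended:
        extended = False
        for read in unused:
            # Require at least one new base, so cap the overlap below len(read).
            max_len = min(len(read) - 1, len(contig))
            # Prefer the longest overlap between contig suffix and read prefix.
            for k in range(max_len, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    contig += read[k:]
                    unused.remove(read)
                    extended = True
                    break
            if extended:
                # Restart the scan against the newly extended contig.
                break
    return contig


# Hypothetical seed and reads (not data from the paper).
seed = "ACGTTGCA"
reads = ["TGCAAGGC", "AGGCTTAA", "TTAACCGG", "CCGGATTC"]
contig = extend_seed(seed, reads)
```

Each pass consumes one read and adds at least one base, so the loop stops once no remaining read overlaps the end of the contig.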

Limitations

Performance on real data is limited by sequencing artifacts, which degrade assembly quality.

Digital Object Identifier (DOI)

10.1186/1471-2105-10-S1-S16
