Binning sequences using very sparse labels within a metagenome
2008

Improving Metagenomic Sequence Binning with S-GSOM

Sample size: 7 publication

Evidence: high

Author Information

Author(s): Chan Chon-Kit Kenneth, Hsu Arthur L, Halgamuge Saman K, Tang Sen-Lin

Primary Institution: The University of Melbourne, Australia

Hypothesis

Can a semi-supervised seeding method improve the binning of metagenomic sequences without relying on completed genomes?

Conclusion

The S-GSOM method outperformed the binning methods it was compared against and did not require knowledge of completed genomes.

Supporting Evidence

  • S-GSOM showed superior performance compared to other semi-supervised methods tested.
  • On the S-GSOM map, some species that have no seeds can still be identified visually.
  • The method does not require knowledge from known genomes.
  • S-GSOM outperformed k-mer and BLAST methods in binning accuracy.

Takeaway

This study introduced a way to group DNA sequences from different species using only a small number of labelled seed sequences, making it easier to understand complex microbial communities.

Methodology

The study implemented a semi-supervised seeding method on a Growing Self-Organising Map (GSOM) to bin metagenomic sequences.
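
The general idea of seeding can be illustrated with a much simpler stand-in than a GSOM. The sketch below is not the authors' implementation: it describes each fragment by its k-mer composition and assigns it the label of the nearest seed, with no map training or growth. The function names (`kmer_vector`, `bin_by_seeds`), the choice of k = 2, the Euclidean distance, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math


def kmer_vector(seq, k=2):
    """Return normalised k-mer frequencies of seq in a fixed k-mer order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made of A, C, G, T are counted; avoid division by zero.
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bin_by_seeds(sequences, seeds, k=2):
    """Assign every sequence the label of its nearest seed.

    sequences: dict mapping fragment name -> DNA string
    seeds:     dict mapping fragment name -> label (a sparse subset)
    Hypothetical simplification: nearest-seed assignment, not a GSOM.
    """
    vectors = {name: kmer_vector(s, k) for name, s in sequences.items()}
    bins = {}
    for name, vec in vectors.items():
        nearest = min(seeds, key=lambda s: euclidean(vec, vectors[s]))
        bins[name] = seeds[nearest]
    return bins


# Toy data (invented): two AT-rich and two GC-rich fragments, one seed each.
sequences = {
    "frag1": "ATATATATATATATATATAT",
    "frag2": "TATATATTATATATATAATA",
    "frag3": "GCGCGCGCGCGCGCGCGCGC",
    "frag4": "CGCGCGGCGCGCCGCGCGCG",
}
seeds = {"frag1": "species_A", "frag3": "species_B"}
bins = bin_by_seeds(sequences, seeds)
```

In this toy case the unlabelled fragments follow the seed with the most similar composition. A map-based method such as the one summarised here would instead cluster fragments on the map and use the seeds to label those clusters.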

Limitations

The method may struggle with poorly defined seeds at the boundaries of clusters.

Digital Object Identifier (DOI)

10.1186/1471-2105-9-215
