Improving Metagenomic Sequence Binning with S-GSOM
Author Information
Author(s): Chan Chon-Kit Kenneth, Hsu Arthur L, Halgamuge Saman K, Tang Sen-Lin
Primary Institution: The University of Melbourne, Australia
Hypothesis
Can a semi-supervised seeding method improve the binning of metagenomic sequences without relying on completed genomes?
Conclusion
The S-GSOM method outperformed existing binning methods without requiring knowledge of completed genomes.
Supporting Evidence
- S-GSOM showed superior performance compared to other semi-supervised methods tested.
- S-GSOM's map visualisation allows species to be identified visually, even without seeds.
- The method does not require knowledge from known genomes.
- S-GSOM outperformed k-mer and BLAST methods in binning accuracy.
Takeaway
This study created a new way to group DNA sequences from different species using a small number of labelled examples (seeds), making it easier to understand complex microbial communities.
Methodology
The study implemented a semi-supervised seeding method on a Growing Self-Organising Map (GSOM) to bin metagenomic sequences.
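The idea of seeding a trained map can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the map has already been trained, that each node has a grid position and a weight vector, and that every sequence has already been converted to a numeric feature vector. The function names, the toy two-dimensional features, and the nearest-seed label propagation rule are all hypothetical choices made for illustration.

```python
from math import dist


def best_matching_node(weights, vector):
    """Return the grid position of the node whose weights are closest to `vector`."""
    return min(weights, key=lambda node: dist(weights[node], vector))


def label_nodes_from_seeds(weights, seeds):
    """Give every node the label of the nearest seeded node on the grid.

    weights: {grid_position: weight_vector} for an already trained map (assumed).
    seeds:   list of (feature_vector, label) pairs whose origin is known.
    If two seeds land on the same node, the later one overwrites the earlier.
    """
    seeded = {}
    for vector, label in seeds:
        seeded[best_matching_node(weights, vector)] = label
    # Hypothetical propagation rule: nearest seeded node by grid distance.
    return {
        node: seeded[min(seeded, key=lambda s: dist(s, node))]
        for node in weights
    }


def bin_sequences(weights, node_labels, vectors):
    """Assign each feature vector the label of its best-matching node."""
    return [node_labels[best_matching_node(weights, v)] for v in vectors]


# Toy 4-node map laid out on a line; the values are made up for illustration.
weights = {
    (0, 0): [0.9, 0.1],
    (1, 0): [0.8, 0.2],
    (2, 0): [0.2, 0.8],
    (3, 0): [0.1, 0.9],
}
seeds = [([0.95, 0.05], "A"), ([0.05, 0.95], "B")]

node_labels = label_nodes_from_seeds(weights, seeds)
print(bin_sequences(weights, node_labels, [[0.7, 0.3], [0.3, 0.7]]))  # ['A', 'B']
```

In this sketch only two labelled seeds are needed to label the whole map, which reflects the general appeal of seeding: the unlabelled sequences shape the map, and a few known examples name its regions.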
Limitations
The method may struggle with poorly defined seeds at the boundaries of clusters.