MotifCluster: an interactive online tool for clustering and visualizing sequences using shared motifs
2008

MotifCluster: A Tool for Clustering and Visualizing Protein Sequences

Sample size: 4887 publication Evidence: high

Author Information

Author(s): Micah Hamady, Jeremy Widmann, Shelley D. Copley, Rob Knight

Primary Institution: University of Colorado, Boulder, CO, USA

Hypothesis

MotifCluster aims to improve the identification of evolutionary relationships between distantly related protein families by clustering sequences based on shared motifs.

Conclusion

MotifCluster effectively clusters protein sequences based on shared motifs, assigning families to the correct superfamilies with a false positive rate of 0.17%.

Supporting Evidence

  • MotifCluster assigned families to the correct superfamilies with a 0.17% false positive rate.
  • The tool allows users to visualize motifs on protein structures, aiding in functional analysis.
  • Clustering based on motifs provides better insights into evolutionary relationships than traditional methods.

Takeaway

MotifCluster helps scientists group similar proteins by looking at tiny parts they share, making it easier to understand how they are related.

Methodology

MotifCluster uses various distance metrics to cluster sequences based on user-supplied motifs and visualizes these motifs on protein structures.
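The idea of clustering sequences by the motifs they share can be sketched in a few lines. This is only an illustration under stated assumptions, not MotifCluster's actual implementation: the Jaccard distance is one possible choice among the "various distance metrics" mentioned above, single-linkage grouping with a fixed threshold is a stand-in for whatever clustering the tool performs, and the names `jaccard_distance`, `cluster_by_motifs`, the sequence labels, and the motif labels are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|; two empty motif sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_by_motifs(motifs_by_seq, threshold=0.5):
    """Single-linkage grouping: merge sequences whose distance is <= threshold.

    motifs_by_seq maps a sequence name to the set of motif labels found in it.
    Both the metric and the threshold are assumptions made for illustration.
    """
    # Union-find structure: every sequence starts in its own cluster.
    parent = {name: name for name in motifs_by_seq}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    # Merge the clusters of every pair that shares enough motifs.
    for a, b in combinations(motifs_by_seq, 2):
        if jaccard_distance(motifs_by_seq[a], motifs_by_seq[b]) <= threshold:
            parent[find(a)] = find(b)

    # Collect members by cluster root and return them in a stable order.
    clusters = {}
    for name in motifs_by_seq:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical input: four sequences annotated with made-up motif labels.
example = {
    "seqA": {"m1", "m2", "m3"},
    "seqB": {"m1", "m2"},
    "seqC": {"m7", "m8"},
    "seqD": {"m8", "m9"},
}
print(cluster_by_motifs(example))
```

With the default threshold, seqA and seqB (two of three motifs shared) fall into one cluster while seqC and seqD (one of three shared) stay apart; raising the threshold to 0.7 merges the latter pair as well.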

Potential Biases

Motif-finding algorithms may be biased by the presence of closely related sequences in the input set.

Limitations

The results depend on the order of sequences provided in the input set, which can affect clustering outcomes.

Participant Demographics

The study involved a diverse set of protein sequences from various families and superfamilies.

Statistical Information

False Positive Rate

0.17% (assignment of families to superfamilies)

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/gb-2008-9-8-r128
