CLUSS: Clustering of protein sequences based on a new similarity measure
2007

CLUSS: A New Method for Clustering Protein Sequences

Sample size: 1000 | Publication | 10 minutes | Evidence: high

Author Information

Author(s): Kelil Abdellali, Wang Shengrui, Brzezinski Ryszard, Fleury Alain

Primary Institution: Université de Sherbrooke

Hypothesis

Can a novel similarity measure improve the clustering of protein sequences?

Conclusion

CLUSS is an effective method for clustering protein sequences, especially those that are hard to align.

Supporting Evidence

  • CLUSS outperformed existing clustering algorithms in terms of Q-measure.
  • Average Q-measure for CLUSS was over 92% across 1000 tests.
  • CLUSS effectively clustered proteins with known biochemical activities.
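The summary does not define the Q-measure, so the following is only a minimal sketch of one purity-style formulation: for each cluster, count the sequences belonging to its most common known functional group, subtract a penalty for unclustered sequences, and divide by the total number of sequences. The function name `q_measure`, its parameters, and the choice to include unclustered sequences in the denominator are assumptions made for illustration and may differ from the paper's exact definition.

```python
from collections import Counter


def q_measure(clusters, reference, unclustered=0):
    """Purity-style quality score, in percent (illustrative assumption).

    clusters:    list of lists of sequence IDs
    reference:   dict mapping sequence ID -> known functional group label
    unclustered: number of sequences left unassigned (assumed penalty term)
    """
    total = sum(len(c) for c in clusters) + unclustered
    if total == 0:
        raise ValueError("no sequences to score")
    # For each non-empty cluster, size of its largest single reference group.
    majority = sum(
        Counter(reference[s] for s in c).most_common(1)[0][1]
        for c in clusters
        if c
    )
    return 100.0 * (majority - unclustered) / total


# Hypothetical toy data: five sequences, two known functional groups.
toy_reference = {"a": "X", "b": "X", "c": "Y", "d": "Y", "e": "Y"}
toy_clusters = [["a", "b", "c"], ["d", "e"]]
score = q_measure(toy_clusters, toy_reference)
```

Under this formulation, a score of 100 would mean every cluster contains sequences from a single reference group and no sequence was left unclustered.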

Takeaway

Researchers created a new tool called CLUSS to help group similar proteins together, even when they are hard to compare.

Methodology

The study developed a new similarity measure called SMS (Substitution Matching Similarity) and used it to build the CLUSS algorithm for clustering protein families.
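The summary gives no details of how SMS is computed or how CLUSS forms clusters, so the sketch below only illustrates the general pattern described here: score pairs of unaligned sequences with an alignment-free similarity, then group them by that score. Both pieces are stand-ins chosen for illustration, not the authors' method: a Jaccard overlap of k-mers replaces SMS, and simple average-linkage merging with a fixed threshold replaces CLUSS. The names `kmers`, `similarity`, `cluster`, and the `threshold` value are hypothetical.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard overlap of k-mer sets: a stand-in for SMS, not SMS itself."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def cluster(seqs, threshold=0.3, k=3):
    """Average-linkage agglomerative clustering (stand-in for CLUSS).

    Repeatedly merges the two most similar clusters and stops once the
    best available merge scores below `threshold`.
    """
    clusters = [[i] for i in range(len(seqs))]
    # Pairwise similarities, keyed by (smaller index, larger index).
    sim = {(i, j): similarity(seqs[i], seqs[j], k)
           for i, j in combinations(range(len(seqs)), 2)}

    def linkage(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(sim[p] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        (x, y), best = max(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # x < y always holds, so deleting index y leaves index x valid.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical toy sequences: two near-identical pairs with no shared 3-mers.
toy_seqs = ["MKTAYIAKQR", "MKTAYIAKQK", "GGSLVPWWHE", "GGSLVPWWHD"]
groups = cluster(toy_seqs)
```

Because no alignment is computed at any step, this kind of pipeline still works on sequences that are hard to align, which is the property the summary highlights for CLUSS.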

Potential Biases

The reliance on existing databases for validation may introduce bias in the clustering results.

Limitations

The algorithm relies on predetermined substitution matrices and may need further optimization to scale to larger datasets.

Participant Demographics

The study involved protein sequences from various databases, including the COG database.

Statistical Information

P-Value

Not reported

Confidence Interval

Not reported

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1471-2105-8-286
