A method for probabilistic mapping between protein structure and function taxonomies through cross training
2008

Mapping Protein Structure and Function

Sample size: 5751
Publication evidence: moderate

Author Information

Author(s): Kshitiz Gupta, Vivek Sehgal, Andre Levchenko

Primary Institution: Johns Hopkins School of Medicine

Hypothesis

Can probabilistic mapping between protein structure and function taxonomies improve classification accuracy?

Conclusion

The study shows that probabilistic relationships between the SCOP and PROSITE protein classification databases, established through cross training, can improve classifier accuracy.

Supporting Evidence

  • The study found significant semantic overlap between SCOP and PROSITE classifications.
  • Cross training improved the accuracy of classifiers by an average of 5.2%.
  • New features like blocks and 2-D elastic profiles reduced time complexity without compromising performance.

Takeaway

This study helps scientists relate a protein's shape to what it does, so that knowledge of one can inform predictions about the other.

Methodology

The study used hierarchical cross training of Support Vector Machine classifiers to establish relationships between SCOP and PROSITE databases.
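The cross-training idea can be illustrated with a minimal sketch. It assumes that a classifier trained on one taxonomy is applied to proteins grouped by their class in the other taxonomy, and that the fraction of predictions per class is read as a conditional probability. This is an illustration, not the authors' implementation: the function name cross_map, the toy threshold classifier standing in for a trained Support Vector Machine, the feature values, and the function-class labels are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type: a classifier trained on the target taxonomy that maps
# a feature vector to a predicted target label.
Classifier = Callable[[Sequence[float]], str]


def cross_map(
    samples: List[Tuple[str, Sequence[float]]],
    classify: Classifier,
) -> Dict[str, Dict[str, float]]:
    """Estimate P(target label | source label).

    Each sample is (source_label, features). The classifier, trained on the
    target taxonomy, is applied to every sample, and predictions are
    normalised within each source label.
    """
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for source_label, features in samples:
        counts[source_label][classify(features)] += 1

    mapping: Dict[str, Dict[str, float]] = {}
    for source_label, row in counts.items():
        total = sum(row.values())
        mapping[source_label] = {t: n / total for t, n in row.items()}
    return mapping


# Stand-in for a trained SVM (assumption): thresholds a single toy feature.
def toy_structure_classifier(features: Sequence[float]) -> str:
    return "all-alpha" if features[0] > 0.5 else "all-beta"


# Invented toy data: (function-class label, feature vector).
samples = [
    ("kinase-motif", [0.9]),
    ("kinase-motif", [0.8]),
    ("kinase-motif", [0.2]),
    ("zinc-finger", [0.1]),
]

mapping = cross_map(samples, toy_structure_classifier)
for source_label, row in mapping.items():
    print(source_label, row)
```

Swapping the roles of the two taxonomies (a function classifier applied to proteins grouped by structure class) would give the mapping in the opposite direction under the same assumptions.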

Potential Biases

The method may inherit biases built into the SCOP and PROSITE taxonomies, which could affect the resulting mappings.

Limitations

The study was conducted on partial taxonomies and may not represent the full complexity of protein classification.

Digital Object Identifier (DOI)

10.1186/1472-6807-8-40
