Efficacy of different protein descriptors in predicting protein functional families
2007

Evaluating Protein Descriptors for Functional Family Prediction

publication Evidence: moderate

Author Information

Author(s): Ong Serene AK, Lin Hong Huang, Chen Yu Zong, Li Ze Rong, Cao Zhiwei

Primary Institution: National University of Singapore

Hypothesis

Do different protein descriptor-sets, used alone or in combination, differ in how well they predict protein functional families with machine learning?

Conclusion

The study suggests that combining descriptor-sets can slightly improve predictive performance when classifying proteins into functional families.

Supporting Evidence

  • Combination-sets showed slightly better performance than individual descriptor-sets.
  • Performance evaluation was based on the Matthews correlation coefficient (MCC).
  • Three out of four combination-sets consistently outperformed individual descriptor-sets.
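
The Matthews correlation coefficient named above can be computed from the four counts of a binary confusion matrix. The following is a minimal sketch, assuming a binary setting (a protein is or is not a member of a given family). The function name and the choice to return 0.0 when the denominator is zero are illustrative conventions, not details taken from the study.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Compute the MCC from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives,
    false positives, false negatives.
    Returns a value in [-1, 1]; 0.0 is returned when any marginal
    total is zero (a common convention, assumed here).
    """
    denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    print(round(matthews_corrcoef_from_counts(90, 80, 20, 10), 2))
```

With the hypothetical counts shown, the score is roughly 0.70. A value of 1 indicates perfect prediction, 0 is no better than chance, and -1 is total disagreement, which makes the measure more informative than plain accuracy when family members are far outnumbered by non-members.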

Takeaway

This study looked at different ways to describe proteins to see which ones help better predict their functions. Using a mix of descriptions usually worked slightly better than using any single one.

Methodology

Support vector machines (SVM) were used to evaluate six individual descriptor-sets and four combination-sets for predicting protein functional families.
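
To make the idea of individual versus combined descriptor-sets concrete, here is a minimal sketch of how a sequence might be turned into feature vectors before being passed to a classifier such as an SVM. This summary does not name the six descriptor-sets, so the two descriptors below (amino acid composition and dipeptide composition) are common sequence descriptors chosen purely for illustration. Forming a combination-set by concatenating vectors is likewise an assumption, and all function names are hypothetical.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    seq = seq.upper()
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = max(len(seq) - 1, 0)
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs with non-standard residues
            counts[pair] += 1
    return [counts[dp] / total if total else 0.0 for dp in DIPEPTIDES]


def combine_descriptors(seq, descriptor_fns):
    """Concatenate several descriptor vectors into one feature vector
    (an assumed way of building a combination-set)."""
    features = []
    for fn in descriptor_fns:
        features.extend(fn(seq))
    return features


if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # made-up sequence, for illustration only
    combined = combine_descriptors(
        seq, [amino_acid_composition, dipeptide_composition]
    )
    print(len(combined))
```

Each protein then becomes one fixed-length numeric vector regardless of sequence length, which is the form an SVM expects as input.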

Potential Biases

The datasets used include homologous sequences, which may bias the performance estimates.

Limitations

The conclusions may not generalize to other datasets, because performance depends heavily on the dataset used.

Digital Object Identifier (DOI)

10.1186/1471-2105-8-300
