Evaluating Protein Descriptors for Functional Family Prediction
Author Information
Author(s): Ong Serene AK, Lin Hong Huang, Chen Yu Zong, Li Ze Rong, Cao Zhiwei
Primary Institution: National University of Singapore
Hypothesis
Can different protein descriptors improve the prediction of protein functional families using machine learning?
Conclusion
The study suggests that combining descriptor-sets can enhance the predictive performance for classifying proteins.
Supporting Evidence
- Combination-sets showed slightly better performance than individual descriptor-sets.
- Performance evaluation was based on the Matthews correlation coefficient (MCC).
- Three out of four combination-sets consistently outperformed individual descriptor-sets.
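For reference, the MCC for a binary classifier is computed from the four confusion-matrix counts. The snippet below is a minimal sketch of the standard formula, not the study's own code. The function name and the choice to return 0.0 when the denominator is zero are assumptions made for illustration.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. The result lies between -1 and +1.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumption: report 0.0 when any marginal total is zero,
        # a common convention for this undefined case.
        return 0.0
    return numerator / denominator


# Hypothetical counts, used only to show the call.
score = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
```

A value near +1 indicates near-perfect agreement between predicted and true classes, 0 indicates performance no better than chance, and -1 indicates total disagreement. Because it uses all four counts, MCC remains informative when positive and negative classes differ greatly in size.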
Takeaway
This study looked at different ways to describe proteins to see which ones help better predict their functions. Using a mix of descriptions worked best.
Methodology
Support vector machines (SVM) were used to evaluate six individual descriptor-sets and four combination-sets for predicting protein functional families.
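To illustrate what a descriptor-set and a combination-set might look like in practice, the sketch below computes two simple sequence descriptors and concatenates them into one feature vector. This summary does not list the six descriptor-sets evaluated, so amino acid composition and dipeptide composition are assumptions chosen for illustration, and all function names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues) or 1  # avoid division by zero on empty input
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each overlapping residue pair (400 values)."""
    s = seq.upper()
    pairs = [s[i:i + 2] for i in range(len(s) - 1)]
    # Skip pairs containing non-standard symbols such as X or B.
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    total = len(valid) or 1
    counts = {d: 0 for d in DIPEPTIDES}
    for p in valid:
        counts[p] += 1
    return [counts[d] / total for d in DIPEPTIDES]


def combine(*descriptor_sets):
    """Concatenate descriptor vectors into one combination-set vector."""
    combined = []
    for descriptor in descriptor_sets:
        combined.extend(descriptor)
    return combined


# Toy sequence, used only to show the shapes involved.
features = combine(amino_acid_composition("ACDA"),
                   dipeptide_composition("ACDA"))
```

Each protein then becomes a fixed-length numeric vector (420 values here) regardless of sequence length, which is the form an SVM implementation such as scikit-learn's `SVC` expects as input.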
Potential Biases
The datasets may contain homologous sequences, which could bias the performance estimates.
Limitations
The conclusions may not generalize to other datasets, because performance depends strongly on the dataset used.