Building Classifiers for Protein Classification
Author Information
Author(s): Huzefa Rangwala, George Karypis
Primary Institution: University of Minnesota
Hypothesis
Can SVM-based multiclass classification effectively solve remote homology detection and fold recognition problems?
Conclusion
Multiclass SVM-based classification approaches are effective for remote homology prediction and fold recognition, especially when using predictions from binary models constructed for ancestral categories.
Supporting Evidence
- Direct K-way classifiers outperform traditional schemes built from binary classifiers on the protein classification tasks.
- Using information from the SCOP hierarchy reduces misclassifications.
- Because training data are limited, simpler models tend to generalize better than more complex ones.
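The role of ancestral categories can be illustrated with a minimal sketch. It assumes each protein already has decision values from binary models at two levels of the hierarchy (superfamily and fold) and simply adds the parent fold's score to each superfamily's score before picking the maximum. All scores, the function names, and the fixed `fold_weight` are hypothetical; the combination scheme used in the study may differ.

```python
# Hypothetical decision values from binary SVMs for a single protein.
# Labels follow the SCOP-style pattern class.fold.superfamily.
superfamily_scores = {"a.1.1": 0.40, "a.4.5": 0.35, "b.1.1": 0.45}
fold_scores = {"a.1": 0.10, "a.4": 0.90, "b.1": -0.60}


def parent_fold(superfamily):
    """Return the fold identifier of a superfamily, e.g. 'a.4.5' -> 'a.4'."""
    return superfamily.rsplit(".", 1)[0]


def predict_flat(sf_scores):
    """Pick the superfamily whose own binary model gives the highest score."""
    return max(sf_scores, key=sf_scores.get)


def predict_with_ancestors(sf_scores, fd_scores, fold_weight=1.0):
    """Add the weighted score of each superfamily's fold, then take the max.

    fold_weight is an illustrative assumption, not a value from the study.
    """
    combined = {
        sf: score + fold_weight * fd_scores[parent_fold(sf)]
        for sf, score in sf_scores.items()
    }
    return max(combined, key=combined.get)


flat = predict_flat(superfamily_scores)
hier = predict_with_ancestors(superfamily_scores, fold_scores)
print(flat, hier)
```

In this toy case the flat rule picks a superfamily whose fold-level model scores poorly, while the combined rule picks one that agrees with the fold-level prediction.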
Takeaway
This study shows how computers can help scientists figure out which family a protein belongs to based on its sequence, using smart methods that look at many classes at once.
Methodology
The study evaluated various SVM-based multiclass classification methods using datasets derived from the SCOP protein classification.
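As a rough illustration of how class labels can be derived from SCOP, the sketch below splits a SCOP concise classification string of the form class.fold.superfamily.family into its levels and groups domains by fold. The function name and the example domain identifiers are hypothetical, and this is not the study's actual dataset construction.

```python
from collections import defaultdict


def scop_levels(sccs):
    """Split a SCOP identifier such as 'a.1.1.2' into its hierarchy levels."""
    parts = sccs.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected class.fold.superfamily.family, got {sccs!r}")
    return {
        "class": parts[0],
        "fold": ".".join(parts[:2]),
        "superfamily": ".".join(parts[:3]),
        "family": sccs,
    }


# Hypothetical domain identifiers mapped to SCOP identifiers.
domains = {
    "dom1": "a.1.1.2",
    "dom2": "a.1.1.3",
    "dom3": "a.4.5.1",
    "dom4": "b.1.1.1",
}

# Group domains by fold, the label used in a fold recognition setting.
by_fold = defaultdict(list)
for name, sccs in domains.items():
    by_fold[scop_levels(sccs)["fold"]].append(name)

print(dict(by_fold))
```

Using the "superfamily" level as the label instead gives the grouping relevant to remote homology detection.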
Limitations
The limited size of training data makes it challenging to learn complex models.
Digital Object Identifier (DOI)