HMM-ModE: Improved Protein Classification Method
Author Information
Author(s): Prashant K. Srivastava, Dhwani K. Desai, Soumyadeep Nandi, Andrew M. Lynn
Primary Institution: Jawaharlal Nehru University, New Delhi, India
Hypothesis
Can optimizing discrimination thresholds and modifying emission probabilities with negative training sequences improve the classification of protein families using profile hidden Markov models?
Conclusion
The HMM-ModE protocol significantly improves the specificity of protein classification based on molecular function using pre-classified training data.
Supporting Evidence
- Average specificity improved from 21% to 98% after the discrimination threshold was optimized.
- The method was validated on sequences from six sub-families of the AGC family of kinases.
- HMM-ModE showed better performance compared to traditional methods in classifying protein kinases.
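The specificity figures above follow the standard definition: the fraction of non-member sequences that a family model correctly rejects. As a minimal sketch, the metric can be computed from classification counts as shown here. The function names and the example counts are illustrative assumptions, not values taken from the study.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of non-member sequences correctly rejected: TN / (TN + FP)."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no negative sequences to evaluate")
    return true_negatives / total


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of member sequences correctly accepted: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive sequences to evaluate")
    return true_positives / total


# Hypothetical counts for one sub-family model (not data from the paper).
spec = specificity(true_negatives=98, false_positives=2)
sens = sensitivity(true_positives=45, false_negatives=5)
print(f"specificity={spec:.2f} sensitivity={sens:.2f}")
```

Reporting both metrics together matters here, because raising a score cutoff to gain specificity can cost sensitivity.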
Takeaway
This study developed a way to classify proteins more accurately by training models on sequences that do not belong to a family as well as those that do, which helps scientists predict what a protein does.
Methodology
The study used profile hidden Markov models (HMMs) with optimized thresholds and modified emission probabilities based on negative training sequences to classify protein families.
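The threshold-optimization step can be illustrated with a small sketch. It assumes each sequence has already been scored against a family's profile HMM, and it picks the score cutoff that best separates family members from negative training sequences. This summary does not state the selection criterion, so the Matthews correlation coefficient used here is an assumption, as are the function names and the example scores. The emission-probability modification is not shown.

```python
import math
from typing import Optional, Sequence, Tuple


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def pick_threshold(
    pos_scores: Sequence[float], neg_scores: Sequence[float]
) -> Tuple[Optional[float], float]:
    """Return (cutoff, mcc) for the cutoff with the highest MCC.

    A sequence is called a family member when its score >= cutoff.
    Every observed score is tried as a candidate cutoff.
    """
    best_cutoff, best_score = None, -2.0  # MCC is never below -1
    for cutoff in sorted(set(pos_scores) | set(neg_scores)):
        tp = sum(s >= cutoff for s in pos_scores)
        fn = len(pos_scores) - tp
        fp = sum(s >= cutoff for s in neg_scores)
        tn = len(neg_scores) - fp
        score = mcc(tp, fp, tn, fn)
        if score > best_score:  # strict: keep the lowest cutoff on ties
            best_cutoff, best_score = cutoff, score
    return best_cutoff, best_score


# Hypothetical bit scores (not data from the paper).
family_scores = [120.0, 95.5, 88.0, 140.2]   # true family members
negative_scores = [30.0, 60.5, 85.0, 12.3]   # members of other sub-families
cutoff, quality = pick_threshold(family_scores, negative_scores)
print(cutoff, quality)
```

Keeping the lowest cutoff among ties favors sensitivity, which is relevant given that a stricter threshold can reject short true members.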
Potential Biases
Potential for misclassification if the training data does not adequately represent the diversity of sequences.
Limitations
The method may reduce sensitivity in some cases, particularly for sequences that are shorter than average.
Participant Demographics
The study involved no human participants. The data were protein sequences from sub-families of the AGC family of kinases and from G-protein coupled receptors.
Statistical Information
P-Value and Statistical Significance
p < 0.05