A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis
2008
A New Method for Protein Homology Detection Using Top-n-grams
Sample size: 4352
publication
Evidence: moderate
Author Information
Author(s): Liu Bin, Wang Xiaolong, Lin Lei, Dong Qiwen, Wang Xuan
Primary Institution: Harbin Institute of Technology Shenzhen Graduate School
Hypothesis
Can Top-n-grams improve the performance of protein remote homology detection and fold recognition?
Conclusion
The Top-n-gram-based method significantly outperforms many existing methods for protein remote homology detection.
Supporting Evidence
- Top-n-grams improve prediction performance for remote homology detection.
- The method outperforms traditional methods like N-grams and binary profiles.
- Latent semantic analysis enhances the effectiveness of Top-n-grams.
Takeaway
This study introduces a new way to look at proteins that helps scientists find similarities between them better than before.
Methodology
The study extracts Top-n-grams from protein sequence frequency profiles, uses them as features (with latent semantic analysis applied to refine the representation), and trains SVM classifiers for detection.
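The feature-extraction step can be illustrated with a minimal sketch. It assumes that a Top-n-gram is formed at each profile position by concatenating the n most frequent amino acids in descending order of frequency, and that a protein is then represented by the occurrence counts of these words. The function names, the dict-based profile layout, the alphabetical tie-breaking rule, and the toy profile are all hypothetical and are not taken from the paper; the latent semantic analysis and SVM stages are omitted.

```python
from collections import Counter


def top_n_grams(profile, n=2):
    """Return one Top-n-gram (a string of n residues) per profile column.

    profile: list of columns, each a dict mapping an amino acid letter
             to its frequency at that position (hypothetical layout).
    """
    words = []
    for column in profile:
        # Rank residues by descending frequency.
        # Ties are broken alphabetically (an assumption, for determinism).
        ranked = sorted(column, key=lambda aa: (-column[aa], aa))
        words.append("".join(ranked[:n]))
    return words


def bag_of_words(words):
    """Count how often each Top-n-gram occurs in one protein."""
    return Counter(words)


# Toy three-position frequency profile (made-up numbers).
profile = [
    {"A": 0.6, "G": 0.3, "S": 0.1},
    {"L": 0.5, "I": 0.4, "V": 0.1},
    {"A": 0.7, "G": 0.2, "T": 0.1},
]

words = top_n_grams(profile, n=2)
features = bag_of_words(words)
print(words)
print(dict(features))
```

In a full pipeline, the resulting count vectors would be mapped onto a fixed vocabulary of Top-n-grams and passed to a classifier such as an SVM; that step is left out here to keep the sketch dependency-free.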
Limitations
The method may not perform as well as Profile and SW-PSSM in some cases.
Statistical Information
P-Value
3e-9
Statistical Significance
p<0.05