Predicting Disordered Proteins Using Structure-Unknown Data
Author Information
Author(s): Shimizu Kana, Muraoka Yoichi, Hirose Shuichi, Tomii Kentaro, Noguchi Tamotsu
Primary Institution: Waseda University
Hypothesis
Can incorporating structure-unknown protein data into a novel prediction method improve the prediction of disordered proteins?
Conclusion
The proposed method predicts disordered proteins more accurately than existing methods and is less affected by training data sparseness.
Supporting Evidence
- The proposed method achieved a Matthews correlation coefficient (MCC) 0.202 points higher than FoldIndex.
- It achieved a sensitivity of 83.4% for disordered proteins, outperforming several per-residue predictors.
- The method was less affected by training data sparseness than traditional supervised learning methods.
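The two metrics cited above can be computed from a confusion matrix. The following is a minimal sketch using the standard definitions of the Matthews correlation coefficient and sensitivity. The function names and the toy labels are illustrative assumptions and are not taken from the study.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = disordered, 0 = ordered)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def sensitivity(tp, fn):
    """Fraction of truly disordered proteins that were predicted as disordered."""
    return tp / (tp + fn) if (tp + fn) else 0.0

if __name__ == "__main__":
    # Toy labels, made up for illustration only.
    y_true = [1, 1, 0, 0]
    y_pred = [1, 0, 0, 0]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print("MCC:", mcc(tp, tn, fp, fn))
    print("Sensitivity:", sensitivity(tp, fn))
```

MCC ranges from -1 to 1 and takes all four confusion-matrix cells into account, which makes it more informative than accuracy when disordered proteins are a small minority of the data.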
Takeaway
The study introduces a new way to identify disordered (unstructured) proteins, which can help scientists understand how these proteins function in the body.
Methodology
The study used a spectral graph transducer for binary classification, incorporating both labeled and unlabeled protein data.
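To illustrate how unlabeled data can inform a binary classifier, here is a minimal sketch of graph-based transductive learning. It is not the spectral graph transducer used in the study. It swaps in a simpler iterative label propagation scheme that shares the same general idea: labeled and unlabeled examples are nodes in a similarity graph, and labels spread along its edges. The feature vectors, the `sigma` parameter, and all function names are hypothetical.

```python
import math

def gaussian_weights(points, sigma=1.0):
    """Build a dense similarity graph with Gaussian edge weights (no self-loops)."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return w

def propagate_labels(points, labels, sigma=1.0, iterations=200):
    """Transductive binary classification by iterative label propagation.

    labels: +1 (disordered), -1 (ordered), or None (structure unknown).
    Labeled nodes stay clamped; each unlabeled node repeatedly takes the
    weighted average score of its neighbours.
    """
    w = gaussian_weights(points, sigma)
    scores = [float(l) if l is not None else 0.0 for l in labels]
    for _ in range(iterations):
        new_scores = list(scores)
        for i, label in enumerate(labels):
            if label is None:
                total = sum(w[i])
                if total > 0:
                    new_scores[i] = sum(wij * s for wij, s in zip(w[i], scores)) / total
        scores = new_scores
    # Threshold the propagated scores to obtain a binary prediction.
    return [1 if s >= 0 else -1 for s in scores]

if __name__ == "__main__":
    # Hypothetical 2-D feature vectors: one labeled protein per class, four unlabeled.
    points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.2, 2.9), (2.9, 3.3)]
    labels = [1, None, None, -1, None, None]
    print(propagate_labels(points, labels))
```

With only one labeled example per class, the unlabeled points still receive labels from their nearest cluster. This is the kind of situation in which a transductive method can be less sensitive to sparse training data than a purely supervised one.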
Potential Biases
The study acknowledges potential bias in protein databases against disordered proteins.
Limitations
The method may not accurately predict partially disordered proteins and relies on the quality of unlabeled data.