Protein subfamily assignment using the Conserved Domain Database
2008

Improving Protein Domain Assignments

Sample size: 2929
Evidence: moderate

Author Information

Author(s): Jessica H. Fong, Aron Marchler-Bauer

Primary Institution: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health

Hypothesis

Can a new method for assigning protein domains reduce misclassifications in protein sequence databases?

Conclusion

The proposed domain subfamily assignment rule has significantly improved the accuracy of domain annotations in protein sequences.

Supporting Evidence

  • The proposed method achieved 96% accuracy in domain assignments.
  • Using score thresholds reduced misclassification rates significantly.
  • The method has been incorporated into the CD-Search software for improved protein annotation.

Takeaway

This study helps scientists better classify proteins by using a new method that reduces mistakes in labeling protein functions.

Methodology

The study analyzed alignment scores from NCBI-curated domain assignments and proposed heuristics for better classification.
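A threshold-based heuristic of the kind described above might look like the following minimal sketch. It is an illustration only, not the authors' rule or the CD-Search implementation: the `Hit` structure, the `assign_domain` function, the per-subfamily `thresholds` table, and the fallback to the parent family are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Hit:
    """One alignment between a query protein and a domain profile (hypothetical structure)."""
    subfamily: str  # name of the subfamily profile that was hit
    family: str     # parent family of that subfamily
    score: float    # alignment score; higher means a better match


def assign_domain(hits: List[Hit], thresholds: Dict[str, float]) -> Optional[str]:
    """Pick a label for a query from its alignment hits.

    Assumed rule: take the top-scoring hit and accept its subfamily only if
    the score reaches that subfamily's threshold. Otherwise fall back to the
    more general parent family. Returns None when there are no hits.
    """
    if not hits:
        return None
    best = max(hits, key=lambda h: h.score)
    cutoff = thresholds.get(best.subfamily)
    # A subfamily without a known threshold is never assigned directly.
    if cutoff is not None and best.score >= cutoff:
        return best.subfamily
    return best.family


# Usage with made-up scores and thresholds:
example_hits = [Hit("subA", "famX", 120.0), Hit("subB", "famX", 95.0)]
label = assign_domain(example_hits, {"subA": 100.0, "subB": 90.0})
```

Falling back to the parent family rather than forcing a subfamily label is one way to limit misclassification when the true subfamily is missing from the database, which the summary lists as a potential source of bias.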

Potential Biases

Potential bias from longer profiles and missing subfamilies could affect classification accuracy.

Limitations

The study may not account for all possible domain subfamilies due to incomplete databases.

Statistical Information

P-Value

0.85

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1756-0500-1-114
