Integrating Classifications of Protein Domains
Author Information
Author(s): Schaeffer R, Medvedev Kirill E, Andreeva Antonina, Chuguransky Sara Rocio, Pinto Beatriz Lazaro, Zhang Jing, Cong Qian, Bateman Alex, Grishin Nick V
Primary Institution: University of Texas Southwestern Medical Center
Hypothesis
How can we effectively classify protein domains using both experimental and predicted structural data?
Conclusion
The study successfully integrates over 1.8 million protein domains from both experimental and predicted structures into a unified classification system.
Supporting Evidence
- ECOD classifies over 1.8 million domains from over 1 million proteins.
- Integrating Pfam with ECOD yields more accurate and less redundant classifications.
- 90% of residues in the classified proteomes were assigned to existing ECOD homologous groups.
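The residue-level figure above is a coverage ratio: residues that fall inside at least one assigned domain, divided by all residues in the proteome. A minimal sketch of that arithmetic follows. The function names, the input layout, and the example proteins are hypothetical and for illustration only; they are not the study's pipeline or data.

```python
# Hypothetical sketch: fraction of residues covered by assigned domains.
# Domain ranges are 1-based and inclusive; overlapping ranges are counted once.

def covered_residues(length, ranges):
    """Return the set of residue positions (1..length) inside any range."""
    covered = set()
    for start, end in ranges:
        # Clip each range to the protein so out-of-bounds ends are ignored.
        covered.update(range(max(start, 1), min(end, length) + 1))
    return covered


def proteome_coverage(proteome):
    """proteome maps a protein ID to (length, [(start, end), ...])."""
    total = sum(length for length, _ in proteome.values())
    assigned = sum(
        len(covered_residues(length, ranges))
        for length, ranges in proteome.values()
    )
    return assigned / total if total else 0.0


# Made-up example: two 100-residue proteins, 90 residues covered in each.
example = {
    "protein_A": (100, [(1, 50), (41, 90)]),  # overlapping domains
    "protein_B": (100, [(11, 100)]),
}
print(proteome_coverage(example))  # 0.9
```

Counting positions in a set rather than summing domain lengths keeps overlapping domain boundaries from inflating the ratio.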
Takeaway
Scientists are figuring out how to group proteins based on their shapes and functions, using both real and computer-made models to do it better.
Methodology
The study combines sequence and structural data to classify protein domains, drawing on predicted structures from the AlphaFold Database and experimental structures from the Protein Data Bank.
Limitations
Classifications derived from predicted structures do not update the core classification, and inconsistencies between ECOD and Pfam domain definitions may remain.