The Pfam protein families database: embracing AI/ML
2025

publication Evidence: high

Author Information

Author(s): Paysan-Lafosse Typhaine, Andreeva Antonina, Blum Matthias, Chuguransky Sara Rocio, Grego Tiago, Pinto Beatriz Lazaro, Salazar Gustavo A, Bileschi Maxwell L, Llinares-López Felipe, Meng-Papaxanthos Laetitia, Colwell Lucy J, Grishin Nick V, Schaeffer R Dustin, Clementel Damiano, Tosatto Silvio C E, Sonnhammer Erik, Wood Valerie, Bateman Alex

Primary Institution: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI)

Conclusion

The Pfam database has integrated AI and machine learning to enhance protein family classification and coverage.

Supporting Evidence

  • Pfam has been integrated into InterPro, which provides a single platform for protein family information.
  • AI techniques have led to significant increases in protein family coverage.
  • New families have been discovered through large-scale sequence similarity analysis.

Takeaway

Pfam is a database that helps scientists understand proteins better, and it now uses smart computer programs to find out even more about them.

Methodology

The study involved updating the Pfam database with new protein families and using AI to improve classification.

Limitations

Many protein families remain unclassified, and the transition to the InterPro website may cause confusion among users.

Digital Object Identifier (DOI)

10.1093/nar/gkae997
