canSAR 2024—an update to the public drug discovery knowledgebase
Author Information
Author(s): Gingrich Phillip W, Chitsazi Rezvan, Biswas Ansuman, Jiang Chunjie, Zhao Li, Tym Joseph E, Brammer Kevin M, Li Jun, Shu Zhigang, Maxwell David S, Tacy Jeffrey A, Mica Ioan L, Darkoh Michael, di Micco Patrizio, Russell Kaitlyn P, Workman Paul, Al-Lazikani Bissan
Primary Institution: University of Texas MD Anderson Cancer Center
Hypothesis
How can integrating diverse data sources improve cancer drug discovery?
Conclusion
The latest updates to canSAR enhance its capabilities for cancer drug discovery by integrating more data and improving algorithms.
Supporting Evidence
- canSAR integrates data from over 25 sources to enhance cancer drug discovery.
- Over 4.5 million compounds and 13.3 million bioactivities are included in canSAR.
- canSAR has identified nearly 600,000 ligandable pockets across protein chains.
Takeaway
canSAR is like a big library that helps scientists find new medicines for cancer by bringing together lots of different information.
Methodology
canSAR integrates data from over 25 sources, including genomic, chemical, and clinical data, and uses machine learning algorithms for analysis.
Potential Biases
The previous method of labeling pockets as undruggable may have biased the model against finding novel druggable sites.
Limitations
The majority of protein pockets are still considered undruggable, and the precision of predictions cannot be quantified due to the nature of the data.
Participant Demographics
The data include 12,561 tumor samples from 12,520 patients and 19,408 RNA-seq samples from 10,955 patients.
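The sample and patient counts above differ, so some patients contributed more than one sample. A minimal sketch of that arithmetic follows, using only the four figures reported here. The names `COHORTS` and `samples_per_patient` are illustrative and not part of canSAR.

```python
# Counts taken from the summary above; the dictionary layout is an assumption
# made for illustration only.
COHORTS = {
    "tumor": {"samples": 12_561, "patients": 12_520},
    "rnaseq": {"samples": 19_408, "patients": 10_955},
}


def samples_per_patient(samples: int, patients: int) -> float:
    """Return the mean number of samples contributed by each patient."""
    if patients <= 0:
        raise ValueError("patients must be positive")
    return samples / patients


# Mean samples per patient for each cohort.
ratios = {name: samples_per_patient(**counts) for name, counts in COHORTS.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f} samples per patient")
```

The tumor cohort is close to one sample per patient (41 more samples than patients), while the RNA-seq cohort averages about 1.77 samples per patient. The summary does not say why the RNA-seq ratio is higher.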