Fast 3D Shape Screening of Large Chemical Databases
Author Information
Author(s): Fabien Fontaine, Evan Bolton, Yulia Borodina, Stephen H. Bryant
Primary Institution: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services
Hypothesis
Can alignment-recycling improve the efficiency of 3D shape similarity searches in large chemical databases?
Conclusion
The alignment-recycling method significantly reduces CPU time for shape similarity searches by precomputing a small subset of shape overlays.
Supporting Evidence
- The alignment-recycling method reduces CPU time for shape similarity searches by over 100-fold.
- Using a dataset of over one million compounds, the method achieved better than 80% hit list overlap with traditional methods.
- The study focused on small molecules with fewer than 28 heavy atoms and fewer than 6 rotatable bonds.
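The size limits and the hit list overlap figure above can be illustrated with a short Python sketch. The summary does not say how overlap was computed, so this assumes it is the fraction of hits from the traditional search that also appear in the alignment-recycling hit list. The `CompoundRecord` type, its field names, and the compound identifiers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundRecord:
    """Hypothetical record; field names are illustrative, not from the study."""
    cid: int
    heavy_atoms: int
    rotatable_bonds: int


def in_study_scope(record, max_heavy_atoms=28, max_rotatable_bonds=6):
    """True if the compound has fewer atoms and bonds than both limits."""
    return (record.heavy_atoms < max_heavy_atoms
            and record.rotatable_bonds < max_rotatable_bonds)


def hit_list_overlap(reference_hits, candidate_hits):
    """Fraction of reference hits also found in the candidate hit list.

    Assumption: overlap is measured relative to the reference (traditional)
    hit list; the summary does not state the exact definition.
    """
    reference = set(reference_hits)
    if not reference:
        raise ValueError("reference hit list is empty")
    return len(reference & set(candidate_hits)) / len(reference)


# Made-up records: only the first one is under both limits.
records = [
    CompoundRecord(cid=1, heavy_atoms=20, rotatable_bonds=3),
    CompoundRecord(cid=2, heavy_atoms=28, rotatable_bonds=2),
    CompoundRecord(cid=3, heavy_atoms=15, rotatable_bonds=6),
]
in_scope = [r.cid for r in records if in_study_scope(r)]

# Made-up hit lists: four of the five traditional hits are recovered.
traditional_hits = [101, 102, 103, 104, 105]
recycled_hits = [101, 102, 103, 105, 999]
overlap = hit_list_overlap(traditional_hits, recycled_hits)
```

With these made-up inputs, a recycled hit list that recovers four of five traditional hits sits exactly at the 80% level quoted above.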
Takeaway
This study found a way to quickly compare shapes of molecules in big databases, making it much faster to find similar ones.
Methodology
The study used a hybrid methodology called alignment-recycling to efficiently retrieve and align structures with similar 3D shapes from a large dataset of PubChem compounds.
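One plausible reading of alignment-recycling, sketched below in Python, is that each database structure is overlaid once, ahead of time, onto a reference shape, and the stored rigid-body transform is reused at query time instead of computing a fresh overlay for every query-database pair. The summary gives no implementation details, so the transform representation, the function names, and the numeric values here are all assumptions made for illustration, not the authors' code.

```python
import math


def rot_z(angle):
    """3x3 rotation matrix about the z axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(transform, point):
    """Apply a rigid transform (rotation, translation) to a 3D point."""
    rotation, translation = transform
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def compose(outer, inner):
    """Transform equivalent to applying `inner` first, then `outer`."""
    (ro, to), (ri, ti) = outer, inner
    rotation = [[sum(ro[i][k] * ri[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]
    translation = [sum(ro[i][k] * ti[k] for k in range(3)) + to[i]
                   for i in range(3)]
    return (rotation, translation)


def invert(transform):
    """Inverse of x -> R x + t, which is y -> R^T y - R^T t."""
    rotation, translation = transform
    transposed = [[rotation[j][i] for j in range(3)] for i in range(3)]
    inv_translation = [-sum(transposed[i][k] * translation[k] for k in range(3))
                       for i in range(3)]
    return (transposed, inv_translation)


# Precomputed offline (hypothetical values): database structure -> reference.
db_to_ref = (rot_z(math.pi / 2), [1.0, 0.0, 0.0])

# Computed once per query (hypothetical values): query -> reference.
query_to_ref = (rot_z(-math.pi / 6), [0.0, 2.0, -1.0])

# Recycled overlay: database -> reference -> query frame, with no direct
# database-to-query shape optimization.
db_to_query = compose(invert(query_to_ref), db_to_ref)

# Consistency check: a point fixed in the reference frame, expressed in both
# molecular frames, should coincide after the recycled transform is applied.
ref_point = (0.5, -1.5, 2.0)
db_point = apply(invert(db_to_ref), ref_point)
query_point = apply(invert(query_to_ref), ref_point)
overlaid = apply(db_to_query, db_point)
max_error = max(abs(a - b) for a, b in zip(overlaid, query_point))
```

In this sketch the per-query cost is one overlay onto the reference plus a cheap matrix composition per database structure, which is the kind of saving the method is aiming for. The quality of the recycled overlay depends on both structures fitting the shared reference well.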
Limitations
The method cannot be used for sub-shape comparisons and may produce poor alignments if a similar reference shape is not present.
Digital Object Identifier (DOI)