Developing and validating predictive decision tree models from mining chemical structural fingerprints and high-throughput screening data in PubChem

2008

Predictive Decision Tree Models for Drug Screening

Sample size: 64 publication

Evidence: moderate

Author Information

Author(s): Han Lianyi, Wang Yanli, Bryant Stephen H

Primary Institution: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health

Hypothesis

Can decision tree models effectively discriminate compound bioactivities using chemical structure fingerprints and high-throughput screening data?

Conclusion

The decision tree models developed can serve as a virtual screening technique to enhance traditional drug discovery methods.

Supporting Evidence

The decision tree models achieved overall accuracies ranging from 96.9% to 98.9%.
Sensitivity and specificity for the models were reported to be greater than 80% and 98%, respectively.
Enrichment factors of 4.4 and 9.7 were observed for cross-dataset predictions.
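
The figures above can all be derived from a confusion matrix of predicted versus measured activity. The following is a minimal sketch, not the authors' code: the function name `screening_metrics` and the example counts are hypothetical, and the enrichment factor uses one common definition (hit rate among predicted actives divided by hit rate in the whole set), which may differ in detail from the definition used in the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity, enrichment factor).

    tp/fn: active compounds predicted active/inactive.
    tn/fp: inactive compounds predicted inactive/active.
    Assumes every denominator below is non-zero.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)          # share of true actives recovered
    specificity = tn / (tn + fp)          # share of true inactives rejected
    hit_rate_selected = tp / (tp + fp)    # actives among predicted actives
    hit_rate_overall = (tp + fn) / total  # actives in the whole set
    enrichment = hit_rate_selected / hit_rate_overall
    return accuracy, sensitivity, specificity, enrichment


# Made-up counts for an imbalanced screen: 100 actives among 10,000 compounds.
acc, sens, spec, ef = screening_metrics(tp=80, fp=100, tn=9800, fn=20)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} "
      f"specificity={spec:.3f} enrichment={ef:.1f}")
```

With so few actives, a model can reach high accuracy by mostly predicting "inactive", so sensitivity and the enrichment factor are the more informative numbers for virtual screening.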

Takeaway

This study built computer models that help scientists predict which chemicals might work as medicines, using the chemicals' structures together with results from rapid, large-scale laboratory screening.

Methodology

Decision tree models were developed using chemical structure fingerprints and validated through 10-fold cross-validation on high-throughput screening data.
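
As an illustration of this workflow, the sketch below trains a small decision tree on binary fingerprints and estimates its accuracy with 10-fold cross-validation. It rests on several assumptions: this summary does not name the tree algorithm or software used, so Gini-based splitting stands in for it; the data are synthetic (activity is simply tied to one fingerprint bit); and all function names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 activity labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def build_tree(rows, labels, depth=0, max_depth=5):
    """Split recursively on the fingerprint bit that most reduces impurity."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: predicted class
    best_bit, best_score = None, gini(labels)
    for bit in range(len(rows[0])):
        on = [y for x, y in zip(rows, labels) if x[bit]]
        off = [y for x, y in zip(rows, labels) if not x[bit]]
        if not on or not off:
            continue  # bit does not separate anything
        score = (len(on) * gini(on) + len(off) * gini(off)) / len(labels)
        if score < best_score:
            best_bit, best_score = bit, score
    if best_bit is None:
        return majority
    on_idx = [i for i, x in enumerate(rows) if x[best_bit]]
    off_idx = [i for i, x in enumerate(rows) if not x[best_bit]]
    return (
        best_bit,
        build_tree([rows[i] for i in on_idx], [labels[i] for i in on_idx],
                   depth + 1, max_depth),
        build_tree([rows[i] for i in off_idx], [labels[i] for i in off_idx],
                   depth + 1, max_depth),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return its class."""
    while isinstance(tree, tuple):
        bit, on_branch, off_branch = tree
        tree = on_branch if row[bit] else off_branch
    return tree


def cross_validate(rows, labels, k=10, seed=0):
    """Return overall accuracy across k held-out folds."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        tree = build_tree([rows[i] for i in train], [labels[i] for i in train])
        correct += sum(predict(tree, rows[i]) == labels[i] for i in held_out)
    return correct / len(rows)


# Synthetic data: 200 compounds, 16-bit fingerprints, active if bit 3 is set.
rng = random.Random(1)
rows = [[rng.randint(0, 1) for _ in range(16)] for _ in range(200)]
labels = [row[3] for row in rows]
accuracy = cross_validate(rows, labels, k=10)
print(f"10-fold cross-validated accuracy: {accuracy:.3f}")
```

Each compound is held out exactly once, so the reported accuracy reflects predictions on data the tree did not see during training. Real screening data are far more imbalanced than this toy set, in which case stratified folds would be a sensible refinement.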

Potential Biases

Noise in the screening data and the imbalance between active and inactive compounds in the datasets may bias the models.

Limitations

The models may be limited by the set of known active compounds and the properties used for training, as well as by the distribution of compounds in the screening collection.

Statistical Information

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1471-2105-9-401
