Boosting with stumps for predicting transcription start sites
2007

CoreBoost: A New Program for Predicting Transcription Start Sites

Sample size: 1445 | Publication | Evidence: high

Author Information

Author(s): Zhao Xiaoyue, Xuan Zhenyu, Zhang Michael Q

Primary Institution: Cold Spring Harbor Laboratory

Hypothesis

Can a boosting technique improve the prediction of transcription start sites in gene finding?

Conclusion

CoreBoost significantly improves the accuracy of locating transcription start sites compared to existing methods.

Supporting Evidence

  • In 39% of cases, CoreBoost's maximal score fell within 50 bp of an annotated TSS, outperforming McPromoter and Eponine.
  • The study demonstrated that tissue-specific information can improve prediction accuracy.
  • CoreBoost uses a two-step approach to promoter recognition and TSS mapping.
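
The "maximal score within 50 bp" figure in the first bullet can be read as a simple hit rate. The sketch below is one plausible way to compute such a number and is not taken from the paper: the function name, the data layout (a dictionary of position-to-score maps per region), the toy values, and the choice to count a distance of exactly 50 bp as a hit are all assumptions made for illustration.

```python
def fraction_within(scores_by_region, annotated_tss, tolerance=50):
    """Fraction of regions whose top-scoring position lies within
    `tolerance` bp of the annotated TSS (hypothetical helper)."""
    hits = 0
    for region, scores in scores_by_region.items():
        # Position with the highest prediction score in this region.
        best_pos = max(scores, key=scores.get)
        # Assumption: the boundary distance itself counts as a hit.
        if abs(best_pos - annotated_tss[region]) <= tolerance:
            hits += 1
    return hits / len(scores_by_region)


# Toy, made-up data: two regions with scores at a few positions.
toy_scores = {
    "r1": {100: 0.2, 140: 0.9, 300: 0.5},  # peak at 140, 20 bp from the TSS
    "r2": {500: 0.8, 620: 0.3},            # peak at 500, 100 bp from the TSS
}
toy_tss = {"r1": 120, "r2": 600}

toy_rate = fraction_within(toy_scores, toy_tss)
```

On this toy input one of the two regions counts as a hit, so the rate is one half.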

Takeaway

CoreBoost is a computer program that helps scientists find where genes start in DNA, making it easier to understand how genes work.

Methodology

CoreBoost uses boosting with stumps (one-level decision trees) to select informative features for predicting transcription start sites.
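
To make the idea concrete, here is a minimal sketch of boosting with stumps. This summary does not say which boosting variant CoreBoost uses, so the sketch substitutes plain discrete AdaBoost as a stand-in; the function names, the toy feature matrix, and the labels are hypothetical and are not CoreBoost's features or code. Because each stump tests a single feature, the set of features picked across rounds doubles as a feature selection.

```python
import math


def stump_predict(row, j, thr, sign):
    # A stump: predict `sign` if feature j exceeds thr, otherwise `-sign`.
    return sign if row[j] > thr else -sign


def fit_stump(X, y, w):
    """Exhaustively find the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, thr, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def boost(X, y, rounds=10):
    """Discrete AdaBoost over stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, sign = fit_stump(X, y, w)
        # Clamp the error so the log below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, sign))
        # Up-weight misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, sign))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def score(model, row):
    # Weighted vote of all stumps; a positive score means class +1.
    return sum(a * stump_predict(row, j, thr, sign) for a, j, thr, sign in model)


def selected_features(model):
    # Indices of the features that at least one stump used.
    return sorted({j for _, j, _, _ in model})


# Toy, made-up data: three features per example, labels +1 / -1.
X_toy = [[0.9, 0.2, 0.5], [0.8, 0.7, 0.1], [0.2, 0.9, 0.4],
         [0.1, 0.1, 0.9], [0.7, 0.3, 0.3], [0.3, 0.8, 0.6]]
y_toy = [1, 1, -1, -1, 1, -1]
toy_model = boost(X_toy, y_toy, rounds=5)
```

In this toy data only the first feature separates the classes, so every stump splits on it and `selected_features(toy_model)` reports that single index. On real data the rounds would typically spread across several features.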

Potential Biases

There is a systematic downstream bias in the ChIP-chip data used for evaluation.

Limitations

The performance of CoreBoost on non-CpG-related promoters remains unsatisfactory.

Statistical Information

P-Value

8 × 10^-9

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/gb-2007-8-2-r17
