Detecting and Correcting Biases in RNA-Sequencing Data
Author Information
Author(s): Zheng Wei, Chung Lisa M, Zhao Hongyu
Primary Institution: Yale University
Hypothesis
How do biases in RNA-Seq data affect gene expression estimates?
Conclusion
The proposed method effectively identifies and corrects biases in gene-level expression measures from RNA-Seq data, leading to more accurate estimates.
Supporting Evidence
- The method reduces bias in gene-level expression estimates more effectively than previous methods.
- Bias patterns were found to be specific to experimental protocols rather than biological sources.
- The corrected gene expression estimates agreed more closely with gold-standard measurements such as TaqMan RT-PCR.
Takeaway
RNA-Seq measurements of gene expression carry systematic biases, but these biases can be modeled and corrected to give more accurate expression estimates.
Methodology
The study used five published RNA-Seq datasets and developed a generalized additive model to correct biases related to gene length, GC content, and dinucleotide frequencies.
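The three covariates named above (gene length, GC content, and dinucleotide frequencies) can all be computed directly from a gene's nucleotide sequence. The following is a minimal sketch of that step only, not the authors' implementation. The function name `sequence_covariates`, the returned dictionary layout, and the example sequence are assumptions made for illustration. Fitting the additive model itself is not shown.

```python
from collections import Counter
from itertools import product


def sequence_covariates(seq):
    """Return length, GC content, and dinucleotide frequencies for one sequence.

    Assumes the sequence contains only A, C, G, and T. Other symbols (such as N)
    are counted in the length but in none of the 16 dinucleotide bins, so the
    frequencies would then sum to less than 1.
    """
    seq = seq.upper()
    length = len(seq)
    if length < 2:
        raise ValueError("sequence must contain at least two bases")

    # Fraction of bases that are G or C.
    gc_content = sum(1 for base in seq if base in "GC") / length

    # Count overlapping dinucleotides: positions (0,1), (1,2), ...
    pair_counts = Counter(seq[i:i + 2] for i in range(length - 1))
    total_pairs = length - 1
    dinucleotide_freq = {
        a + b: pair_counts.get(a + b, 0) / total_pairs
        for a, b in product("ACGT", repeat=2)
    }

    return {
        "length": length,
        "gc_content": gc_content,
        "dinucleotide_freq": dinucleotide_freq,
    }


# Made-up 10-base sequence used purely as an illustration.
example = sequence_covariates("ATGCGCGATA")
```

In a setup like the one described, each gene would contribute one such set of covariates, which would then serve as predictors of the observed read counts in the additive model.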
Potential Biases
Potential biases related to nucleotide composition and gene length were identified.
Limitations
The method may not address the unequal variance across genes that is inherent to RNA-Seq count data.