Bias detection and correction in RNA-Sequencing data
2011

Detecting and Correcting Biases in RNA-Sequencing Data

Publication Evidence: high

Author Information

Author(s): Zheng Wei, Chung Lisa M, Zhao Hongyu

Primary Institution: Yale University

Hypothesis

How do biases in RNA-Seq data affect gene expression estimates?

Conclusion

The proposed method effectively identifies and corrects biases in gene-level expression measures from RNA-Seq data, leading to more accurate estimates.

Supporting Evidence

  • The method reduces bias in gene-level expression estimates more effectively than previous methods.
  • Bias patterns were found to be specific to experimental protocols rather than biological sources.
  • The corrected gene expression estimates agreed better with gold-standard measurements such as TaqMan RT-PCR.

Takeaway

This study shows that RNA-Seq data can be biased, but we can fix these biases to get better results when measuring gene expression.

Methodology

The study analyzed five published RNA-Seq datasets and fitted a generalized additive model to correct gene-level biases related to gene length, GC content, and dinucleotide frequencies.
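
The additive-model idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a log-scale expression measure as the response, uses only log gene length and GC content as covariates (dinucleotide frequencies would enter as further covariates), and swaps the spline smoothers of typical GAM software for simple bin-mean smoothers fitted by backfitting. All function names and the simulated data are hypothetical.

```python
import math
import random


def bin_smooth(x, resid, n_bins=10):
    """Piecewise-constant smoother: mean residual within equal-width bins of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant covariate
    idx = [min(int((v - lo) / width), n_bins - 1) for v in x]
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for b, r in zip(idx, resid):
        sums[b] += r
        counts[b] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    fitted = [means[b] for b in idx]
    center = sum(fitted) / len(fitted)
    return [f - center for f in fitted]  # centered so the intercept stays identifiable


def fit_additive(y, covariates, n_iter=20):
    """Backfitting: y ~ intercept + sum_j f_j(x_j), each f_j a bin-mean smoother."""
    n = len(y)
    intercept = sum(y) / n
    effects = [[0.0] * n for _ in covariates]
    for _ in range(n_iter):
        for j, x in enumerate(covariates):
            # Partial residual: remove the intercept and every other fitted effect.
            others = [sum(e[i] for k, e in enumerate(effects) if k != j)
                      for i in range(n)]
            partial = [y[i] - intercept - others[i] for i in range(n)]
            effects[j] = bin_smooth(x, partial)
    return intercept, effects


def correct(y, effects):
    """Subtract the fitted bias effects, keeping the overall level (intercept)."""
    return [y[i] - sum(e[i] for e in effects) for i in range(len(y))]


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a) *
                           sum((v - mb) ** 2 for v in b))


def simulate(n=500, seed=0):
    """Hypothetical data: true log expression plus length and GC biases."""
    rng = random.Random(seed)
    log_len = [rng.uniform(5.0, 9.0) for _ in range(n)]
    gc = [rng.uniform(0.3, 0.7) for _ in range(n)]
    truth = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = [t + 0.8 * (l - 7.0) - 20.0 * (g - 0.5) ** 2
                for t, l, g in zip(truth, log_len, gc)]
    return log_len, gc, truth, observed


if __name__ == "__main__":
    log_len, gc, truth, observed = simulate()
    _, effects = fit_additive(observed, [log_len, gc])
    corrected = correct(observed, effects)
    print("corr with log length, before:", round(pearson(observed, log_len), 3))
    print("corr with log length, after: ", round(pearson(corrected, log_len), 3))
```

In this simulation the true expression is independent of length and GC content, so removing the fitted effects is harmless. With real data, any genuine biological association between expression and these covariates would be removed along with the technical bias, which is a design trade-off of this kind of correction.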

Potential Biases

Potential biases related to nucleotide composition and gene length were identified.

Limitations

The method may not address unequal variance issues inherent to RNA-Seq data.

Digital Object Identifier (DOI)

10.1186/1471-2105-12-290
