Estimation and correction of non-specific binding in a large-scale spike-in experiment
2007

Correcting Sequence Biases in Microarray Experiments

Sample size: 3,859 clones
Evidence: moderate

Author Information

Author(s): Eugene F Schuster, Eric Blanc, Linda Partridge, Janet M Thornton

Primary Institution: European Bioinformatics Institute

Hypothesis

The present/absent calls in Affymetrix microarray experiments are influenced by probe sequence, particularly the central nucleotide.

Conclusion

Correcting for probe-sequence biases can improve the performance of the MAS 5.0 algorithm in detecting present or absent transcripts.

Supporting Evidence

  • Probesets whose PM probes have central T nucleotides are more likely to be falsely called present.
  • Using a large dataset allows for better assessment of false discovery rates.
  • Methods that correct for probe-sequence biases outperform the MAS 5.0 algorithm.
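One way to check the first point on a spike-in dataset is to group truly absent probesets by how many of their PM probes have a central T, then compare the fraction falsely called present in each group. The sketch below is illustrative only: the function names, the record layout, and the toy sequences are assumptions made for this example, not the authors' code or data.

```python
from collections import defaultdict


def central_base(probe_seq):
    """Return the middle base of an odd-length probe (position 13 of a 25-mer)."""
    return probe_seq[len(probe_seq) // 2].upper()


def count_central(pm_seqs, base="T"):
    """Count the PM probes in one probeset whose central base equals `base`."""
    return sum(1 for seq in pm_seqs if central_base(seq) == base)


def false_present_rate_by_count(probesets, base="T"):
    """Group truly absent probesets by their number of central-`base` PM probes.

    `probesets` holds (pm_sequences, called_present, truly_present) tuples
    (a hypothetical layout). Returns {count: fraction falsely called present}.
    """
    absent = defaultdict(int)
    false_present = defaultdict(int)
    for pm_seqs, called_present, truly_present in probesets:
        if truly_present:
            continue  # only truly absent probesets can yield false present calls
        k = count_central(pm_seqs, base)
        absent[k] += 1
        if called_present:
            false_present[k] += 1
    return {k: false_present[k] / absent[k] for k in sorted(absent)}


def toy_probe(mid):
    """Build a made-up 25-mer with the chosen central base."""
    return "A" * 12 + mid + "C" * 12


# Made-up records for illustration; real probesets contain more probe pairs.
toy = [
    ([toy_probe("T"), toy_probe("T")], True, False),
    ([toy_probe("T"), toy_probe("T")], False, False),
    ([toy_probe("G"), toy_probe("C")], False, False),
    ([toy_probe("T"), toy_probe("G")], True, True),  # truly present, skipped
]
rates = false_present_rate_by_count(toy)
```

A rate that rises with the number of central-T probes would be consistent with the bias described above. The toy values here carry no evidential weight.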

Takeaway

This study shows that the sequence of a microarray probe can cause mistakes in deciding whether a gene's transcript is present or absent, and that correcting for this bias gives better results.

Methodology

The study used the large-scale GoldenSpike spike-in dataset to analyze how probe sequence influences present/absent calls, and assessed performance using ROC curves.
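As a rough illustration of the ROC assessment, the sketch below sweeps a threshold over per-probeset detection p-values and compares the resulting present calls with the known spike-in status. The function names and toy values are assumptions for this example, and it presumes that a smaller p-value means stronger evidence of presence. It does not reproduce the study's evaluation code.

```python
def roc_points(p_values, truly_present):
    """Return (false positive rate, true positive rate) points for a ROC curve.

    A probeset is called present when its p-value is at or below the
    threshold; tied p-values are consumed together so each threshold
    yields a single point. Needs at least one present and one absent item.
    """
    pairs = sorted(zip(p_values, truly_present))
    n_pos = sum(1 for t in truly_present if t)
    n_neg = len(truly_present) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up p-values and spike-in labels, for illustration only.
toy_p = [0.01, 0.02, 0.5, 0.03, 0.8]
toy_truth = [True, True, False, False, True]
toy_curve = roc_points(toy_p, toy_truth)
toy_auc = auc(toy_curve)
```

Comparing the area under the curve for two calling methods on the same labelled probesets is one simple way to summarize which method separates present from absent transcripts better.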

Potential Biases

There is a risk of false present calls due to the central nucleotide of PM probes, particularly when that nucleotide is T.

Limitations

Complete sequence information is not available for every clone in the dataset, which may affect the accuracy of classifications.

Participant Demographics

The dataset consists of cRNA samples made from 3,859 unique clones derived from Drosophila.

Statistical Information

P-Value

0.06

Statistical Significance

p<0.06

Digital Object Identifier (DOI)

10.1186/gb-2007-8-6-r126
