Small sample issues for microarray-based classification
2001

Publication Evidence: low

Author Information

Author(s): Edward R. Dougherty

Primary Institution: Texas A&M University

Hypothesis

How do small sample sizes affect the design and performance of classifiers in microarray data analysis?

Conclusion

Small sample sizes significantly complicate the design and error estimation of classifiers based on microarray data.

Supporting Evidence

  • With small samples, a large number of gene sets can show low estimated error, so a low estimate may not reflect true classifier performance.
  • Error estimates become biased and less reliable when samples are small.
  • Constraining the classifier can reduce design error, but it may increase the error of the best classifier available within the constrained class.
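
The first point can be illustrated with a small simulation. This is a minimal sketch, not the paper's method: it assumes pure-noise expression values, a hypothetical sample of 20 arrays and 2000 genes, and a simple one-gene nearest-mean rule scored on the same data used to design it (resubstitution). The function name and all parameter values are illustrative assumptions.

```python
import numpy as np

def apparent_errors(n_samples=20, n_genes=2000, seed=0):
    """Resubstitution error of a one-gene nearest-mean rule on pure noise.

    All settings are hypothetical; no gene carries class information,
    so the true error of every rule here is 0.5.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_genes))   # noise "expression" values
    y = np.repeat([0, 1], n_samples // 2)           # balanced class labels

    # Per-gene class means, estimated from the same small sample.
    mean0 = X[y == 0].mean(axis=0)
    mean1 = X[y == 1].mean(axis=0)

    # Predict class 1 when a value lies closer to the class-1 mean.
    pred = (np.abs(X - mean1) < np.abs(X - mean0)).astype(int)

    # Fraction of the design samples each gene's rule misclassifies.
    return (pred != y[:, None]).mean(axis=0)

errors = apparent_errors()
n_low = int((errors <= 0.25).sum())
print(f"genes with apparent error <= 0.25: {n_low} of {errors.size}")
print(f"mean apparent error: {errors.mean():.3f} (true error is 0.5)")
```

Because every gene here is noise, any gene that looks like a good classifier does so by chance, which is the situation the bullets describe.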

Takeaway

When scientists study gene expression with only a few samples, disease classifiers can make mistakes because there isn't enough data to make accurate predictions.

Methodology

The paper reviews issues related to classifier design, error estimation, and feature selection in the context of small sample sizes in microarray studies.

Potential Biases

Small samples can produce classifiers that appear accurate when they are not, because the error estimates have high variance.
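
How much an error estimate can vary is easy to sketch. The example below assumes a fixed classifier with a hypothetical true error of 0.3, evaluated on independent test sets, so that the number of misclassified samples is binomial. The function name and the numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np

def estimate_spread(true_error=0.3, n_test=20, n_repeats=10000, seed=0):
    """Mean and standard deviation of a test-set error estimate.

    Assumes a fixed classifier whose (hypothetical) true error is
    `true_error`, scored on `n_repeats` independent test sets.
    """
    rng = np.random.default_rng(seed)
    # Misclassified count per test set is Binomial(n_test, true_error).
    estimates = rng.binomial(n_test, true_error, size=n_repeats) / n_test
    return estimates.mean(), estimates.std()

for n in (20, 200):
    mean_est, std_est = estimate_spread(n_test=n)
    print(f"n_test={n}: mean estimate={mean_est:.3f}, std={std_est:.3f}")
```

Under these assumptions the standard deviation of the estimate is sqrt(p(1 - p) / n), so with only 20 test samples a classifier with true error 0.3 will fairly often be scored at 0.2 or below.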

Limitations

The review discusses the challenges of small sample sizes, including biased error estimates and the difficulty of selecting features from a large set of variables.
