Effects of data transformation and model selection on feature importance in microbiome classification data
2024

Impact of Data Transformation on Microbiome Classification

Sample size: 8500; publication; 10 minutes; Evidence: moderate

Author Information

Author(s): Zuzanna Karwowska, Oliver Aasmets, Mait Metspalu, Andres Metspalu, Lili Milani, Tõnu Esko, Tomasz Kosciolek

Primary Institution: Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia

Hypothesis

How do different data transformations affect feature importance in microbiome classification?

Conclusion

Microbiome data transformations can significantly influence feature selection but have a limited effect on classification accuracy.

Supporting Evidence

  • Presence-absence transformations performed comparably to abundance-based transformations.
  • Only a small subset of predictors is necessary for accurate classification.
  • Different transformations resulted in comparable classification performance.
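To make the contrast between presence-absence and abundance-based transformations concrete, here is a minimal sketch in plain Python. Total-sum scaling and the centered log-ratio are used as two common abundance-based examples; this summary does not list the exact transformations the authors evaluated, so treat that choice, the function names, the pseudocount of 0.5, and the toy counts as assumptions made for illustration.

```python
import math

def presence_absence(counts):
    """Map each count to 1 if the taxon was observed in the sample, else 0."""
    return [1 if c > 0 else 0 for c in counts]

def total_sum_scaling(counts):
    """Relative abundance: divide each count by the sample total."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]

def centered_log_ratio(counts, pseudocount=0.5):
    """CLR: log of each pseudocount-adjusted count minus the mean log.

    The pseudocount (an assumed value) avoids taking log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical read counts for five taxa in one sample (toy data).
sample = [0, 12, 3, 0, 85]

pa = presence_absence(sample)     # keeps only which taxa are present
tss = total_sum_scaling(sample)   # values sum to 1
clr = centered_log_ratio(sample)  # values sum to (approximately) 0
```

Presence-absence discards all abundance information, which is what makes its comparable classification performance a notable finding.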

Takeaway

This study shows that changing how microbiome data are transformed can change which features appear important, but it has little effect on how well healthy and sick people can be told apart.

Methodology

The study analyzed over 8500 samples from 24 shotgun metagenomic datasets using various data transformations and machine learning algorithms.
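One way to picture the comparison is to score features under two transformations and measure how much the top-ranked sets overlap. The sketch below is only an illustration: it swaps in a simple univariate score (absolute difference in class means) for the model-based feature importance that machine learning algorithms would provide, and the counts, labels, function names, and pseudocount are all hypothetical.

```python
import math

def presence_absence(row):
    """1.0 if the taxon was observed, else 0.0."""
    return [1.0 if c > 0 else 0.0 for c in row]

def centered_log_ratio(row, pseudocount=0.5):
    """CLR with an assumed pseudocount to avoid log(0)."""
    logs = [math.log(c + pseudocount) for c in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def mean_difference_scores(samples, labels):
    """Per-feature absolute difference in means between class 1 and class 0.

    A stand-in for model-based feature importance, not the study's method.
    """
    scores = []
    for j in range(len(samples[0])):
        g1 = [s[j] for s, y in zip(samples, labels) if y == 1]
        g0 = [s[j] for s, y in zip(samples, labels) if y == 0]
        scores.append(abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    return scores

def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])

def jaccard(a, b):
    """Overlap between two feature sets, from 0 (disjoint) to 1 (identical)."""
    return len(a & b) / len(a | b)

# Toy count table: rows are samples, columns are taxa.
counts = [
    [0, 40, 5, 0],  # healthy
    [0, 35, 8, 1],  # healthy
    [6, 42, 0, 0],  # diseased
    [9, 38, 0, 2],  # diseased
]
labels = [0, 0, 1, 1]

top_pa = top_k(mean_difference_scores([presence_absence(r) for r in counts], labels), 2)
top_clr = top_k(mean_difference_scores([centered_log_ratio(r) for r in counts], labels), 2)
overlap = jaccard(top_pa, top_clr)
```

An overlap below 1 would indicate that the two transformations highlight different taxa, even when a classifier trained on either version performs similarly.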

Potential Biases

The results may be biased by the study's reliance on a specific set of data transformations and machine learning algorithms.

Limitations

The study focused on commonly used data transformations, which may limit the applicability of findings to other analyses.

Participant Demographics

The study included diverse populations with healthy and diseased individuals.

Statistical Information

P-Value

p<0.05

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/s40168-024-01996-6
