Modeling Disease with Gene Expression and Clinical Data
Author Information
Author(s): Jan Struyf, Seth Dobrin, David Page
Primary Institution: Katholieke Universiteit Leuven
Hypothesis
Does adding demographic and clinical data to gene expression data improve the ability of classification algorithms to distinguish bipolar disorder and schizophrenia from controls?
Conclusion
Support vector machines can effectively distinguish bipolar disorder and schizophrenia from normal controls, with improved accuracy when demographic and clinical data are included.
Supporting Evidence
- SVMs achieved an AUC of 0.92 for bipolar disorder versus control and 0.91 for schizophrenia versus control using gene expression data alone.
- Including demographic and clinical data improved AUC to 0.97 for bipolar disorder and 0.94 for schizophrenia.
- Support vector machines outperformed other classification algorithms in distinguishing between the diseases and controls.
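The AUC values above summarize how well a classifier ranks patients above controls. As a minimal sketch of what the metric measures (not the study's evaluation code), the function below computes AUC as the fraction of patient/control pairs in which the patient receives the higher score, counting ties as half. The labels and scores are made-up illustrative values.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive case (e.g. patient), 0 for a negative case (control).
    scores: classifier outputs, where higher means "more likely positive".
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Illustrative values only: three patients, three controls.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc(labels, scores))  # 8 of the 9 patient/control pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a sense of scale for the reported 0.91 to 0.97.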
Takeaway
Scientists found that combining gene expression data with information about people's backgrounds helps identify mental health conditions such as bipolar disorder and schizophrenia more accurately.
Methodology
The study compared six classification algorithms using gene expression data and demographic/clinical data to distinguish between bipolar disorder, schizophrenia, and control groups.
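The comparison described above can be sketched for one of the algorithms, a support vector machine, as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the data are synthetic, the sample and gene counts are arbitrary, the clinical columns (age, sex, a substance-use score) are hypothetical stand-ins, and the linear kernel and 5-fold cross-validation are choices made here for brevity. It uses NumPy and scikit-learn.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples, n_genes = 120, 50  # arbitrary sizes, not the study's

# Synthetic stand-ins (NOT the study's data): 1 = patient, 0 = control.
y = np.repeat([0, 1], n_samples // 2)
expression = rng.normal(size=(n_samples, n_genes))
expression[:, :5] += 0.5 * y[:, None]        # weak signal in a few genes
clinical = np.column_stack([
    rng.normal(45, 10, n_samples),           # age (uninformative here)
    rng.integers(0, 2, n_samples),           # sex, coded 0/1
    rng.normal(size=n_samples) + 1.0 * y,    # hypothetical substance-use score
])


def mean_auc(features, labels):
    """Mean cross-validated AUC of a linear SVM on standardized features."""
    model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_val_score(model, features, labels, cv=folds, scoring="roc_auc")
    return scores.mean()


# Same classifier, two feature sets: expression alone vs. expression + clinical.
combined = np.hstack([expression, clinical])
auc_expression = mean_auc(expression, y)
auc_combined = mean_auc(combined, y)
print(f"expression only:       {auc_expression:.2f}")
print(f"expression + clinical: {auc_combined:.2f}")
```

Standardizing inside the pipeline keeps the scaling parameters from leaking across folds, and it matters here because age and expression values are on very different scales. Repeating `mean_auc` with other estimators in place of `SVC` would extend the sketch to a multi-algorithm comparison.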
Potential Biases
Alcohol and drug use can influence gene expression and may therefore bias the results.
Limitations
The study is retrospective and may be affected by confounding variables such as alcohol and drug use.
Participant Demographics
The study included 115 schizophrenia patients, 105 bipolar disorder patients, and 112 controls, with demographic data on age, sex, and substance use.
Statistical Information
P-Value
p<0.05
Statistical Significance
p<0.05
Digital Object Identifier (DOI)