Clustering of Gene Expression Data Using CAGE Method
Author Information
Author(s): Shimokawa Kazuro, Okamura-Oho Yuko, Kurita Takio, Frith Martin C, Kawai Jun, Carninci Piero, Hayashizaki Yoshihide
Primary Institution: RIKEN Genomic Sciences Center
Hypothesis
Can a two-step clustering method effectively analyze large-scale transcription start site (TSS) data from CAGE analysis?
Conclusion
The proposed two-step clustering method successfully categorizes a large number of TSSs into meaningful clusters while reducing computational costs.
Supporting Evidence
- The method categorized 159,075 TSSs into 70-100 clusters.
- Clusters exhibited biological features such as tissue-specific expression patterns.
- The two-step method reduced the memory required for the analysis.
Takeaway
The researchers found a way to group a very large number of transcription start sites into a small set of interpretable clusters, making it easier to study how gene expression differs across tissues.
Methodology
A two-step clustering method combining k-means and hierarchical clustering was used to analyze CAGE data.
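The general idea of a two-step approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that k-means runs first to compress the profiles into a modest number of fine clusters, and that hierarchical clustering then runs only on the resulting centroids, which avoids building a pairwise distance matrix over every TSS. The function names, the Euclidean distance, the average linkage, the evenly spaced initial centroids, and the toy three-tissue profiles are all assumptions made for the example.

```python
import random
from math import dist

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: dist(point, centroids[c]))

def kmeans(points, k, iters=50):
    """Step 1 (assumed): plain Lloyd's k-means.
    Initial centroids are evenly spaced input points so the sketch is
    reproducible; real use would prefer random restarts or k-means++."""
    step = max(1, len(points) // k)
    centroids = points[::step][:k]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # New centroid is the coordinate-wise mean of its members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep the old centroid if a cluster becomes empty.
                updated.append(centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]

def agglomerate(centroids, n_clusters):
    """Step 2 (assumed): average-linkage agglomerative clustering of the
    k-means centroids. Returns groups of centroid indices."""
    groups = [[i] for i in range(len(centroids))]

    def avg_link(a, b):
        total = sum(dist(centroids[i], centroids[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(groups) > n_clusters:
        pairs = ((x, y) for x in range(len(groups)) for y in range(x + 1, len(groups)))
        a, b = min(pairs, key=lambda xy: avg_link(groups[xy[0]], groups[xy[1]]))
        merged = groups.pop(b)  # b > a, so index a is unaffected
        groups[a].extend(merged)
    return groups

def two_step(points, k_fine, n_final):
    """Map each point to a final cluster via its fine k-means cluster."""
    centroids, fine_labels = kmeans(points, k_fine)
    groups = agglomerate(centroids, n_final)
    fine_to_final = {c: g for g, members in enumerate(groups) for c in members}
    return [fine_to_final[lab] for lab in fine_labels]

# Toy data (hypothetical): 30 expression profiles over three tissues,
# ten profiles per tissue-specific pattern, with bounded noise.
rng = random.Random(0)
patterns = [(10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
profiles = [
    tuple(v + rng.uniform(-1, 1) for v in pattern)
    for pattern in patterns
    for _ in range(10)
]
labels = two_step(profiles, k_fine=6, n_final=3)
```

In this toy setting each tissue-specific pattern ends up in its own final cluster. The hierarchical step only ever compares the six centroids, so its cost depends on the number of fine clusters rather than on the number of profiles, which is the kind of memory saving a two-step design aims for.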
Limitations
The study primarily focused on mouse data, which may limit the generalizability of the findings to other species.
Participant Demographics
Mouse samples from various organs and tissues were used.
Digital Object Identifier (DOI)