Large-scale clustering of CAGE tag expression data
2007

Clustering of Gene Expression Data Using the CAGE Method

Sample size: 127

Publication

Evidence: moderate

Author Information

Author(s): Shimokawa Kazuro, Okamura-Oho Yuko, Kurita Takio, Frith Martin C, Kawai Jun, Carninci Piero, Hayashizaki Yoshihide

Primary Institution: RIKEN Genomic Sciences Center

Hypothesis

Can a two-step clustering method effectively analyze large-scale transcription start site (TSS) data produced by cap analysis of gene expression (CAGE)?

Conclusion

The proposed two-step clustering method successfully categorizes a large number of TSSs into meaningful clusters while reducing computational costs.

Supporting Evidence

  • The method categorized 159,075 TSSs into 70-100 clusters.
  • Clusters exhibited biological features such as tissue-specific expression patterns.
  • The two-step method reduced the computational memory required for analysis.
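
The memory saving in the last point can be made concrete with a rough calculation. The sketch below assumes that the expensive part of a one-step hierarchical clustering is the table of all pairwise distances, stored as 8-byte floats; the summary does not state this, and the intermediate group count of 100 is a hypothetical stand-in, not a figure from the study.

```python
def condensed_matrix_bytes(n_items, bytes_per_value=8):
    """Bytes needed to store the n*(n-1)/2 pairwise distances of n items."""
    return n_items * (n_items - 1) // 2 * bytes_per_value


n_tss = 159_075   # TSS count quoted in the summary
n_groups = 100    # hypothetical number of intermediate groups (assumption)

full = condensed_matrix_bytes(n_tss)        # hierarchical clustering on every TSS
reduced = condensed_matrix_bytes(n_groups)  # hierarchical clustering on group centroids

print(f"all TSSs:   {full / 1e9:.1f} GB")     # all TSSs:   101.2 GB
print(f"{n_groups} groups: {reduced / 1e3:.1f} kB")  # 100 groups: 39.6 kB
```

Under these assumptions, clustering every TSS directly would need on the order of 100 GB for distances alone, while clustering a small set of group representatives needs a negligible amount.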

Takeaway

The researchers found a way to sort a very large set of gene activity measurements into a manageable number of groups, making it easier to study how genes behave in different tissues.

Methodology

A two-step clustering method combining k-means and hierarchical clustering was used to analyze CAGE data.
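
One way such a combination might look is sketched below: k-means first compresses the profiles into a modest number of coarse groups, and hierarchical (average-linkage) clustering then merges the group centroids into the final clusters. The order of the two steps, the evenly spaced seeding, the Euclidean distance, and all function and parameter names (`kmeans`, `agglomerate`, `two_step_cluster`, `k_coarse`, `n_final`) are assumptions made for illustration; the summary does not describe the authors' actual procedure in this detail, and the toy data is synthetic.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]  # evenly spaced rows as seeds
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a group is empty
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids


def agglomerate(centroids, n_clusters):
    """Average-linkage merging of centroid indices until n_clusters remain."""
    groups = [[i] for i in range(len(centroids))]

    def avg_link(a, b):
        total = sum(dist(centroids[i], centroids[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(groups) > n_clusters:
        pairs = ((i, j) for i in range(len(groups))
                 for j in range(i + 1, len(groups)))
        i, j = min(pairs, key=lambda ij: avg_link(groups[ij[0]], groups[ij[1]]))
        merged = groups.pop(j)  # j > i, so index i is unaffected
        groups[i].extend(merged)
    return groups


def two_step_cluster(points, k_coarse, n_final):
    """Step 1: k-means into k_coarse groups. Step 2: merge groups hierarchically."""
    labels, centroids = kmeans(points, k_coarse)
    groups = agglomerate(centroids, n_final)
    coarse_to_final = {c: g for g, members in enumerate(groups) for c in members}
    return [coarse_to_final[lab] for lab in labels]


# Synthetic stand-in for expression profiles: 3 "tissue-specific" patterns,
# 20 noisy profiles each, over 3 hypothetical tissues.
rng = random.Random(0)
centers = [(10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
profiles = [tuple(v + rng.uniform(-0.5, 0.5) for v in c)
            for c in centers for _ in range(20)]

final_labels = two_step_cluster(profiles, k_coarse=6, n_final=3)
print(sorted(set(final_labels)))
```

In this arrangement the hierarchical step only ever sees `k_coarse` centroids rather than every profile, which is where a saving in memory would come from.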

Limitations

The study primarily focused on mouse data, which may limit the generalizability of the findings to other species.

Participant Demographics

Mouse samples from various organs and tissues were used.

Digital Object Identifier (DOI)

10.1186/1471-2105-8-161
