CLU: A New Algorithm for EST Clustering
Author Information
Author(s): Andrey Ptitsyn, Winston Hide
Primary Institution: Pennington Biomedical Research Center
Hypothesis
The study proposes a new algorithm for clustering expressed sequence tags (ESTs) that improves upon existing methods.
Conclusion
The CLU algorithm represents a new generation of EST clustering with improved performance over current approaches.
Supporting Evidence
- The CLU algorithm automatically ignores low-complexity regions like poly-tracts and short tandem repeats.
- CLU can be applied in small and medium-sized projects.
- The program is open source and available free of charge.
Takeaway
The CLU program helps group similar DNA fragments together better than older methods, making it easier for scientists to study genes.
Methodology
The authors developed a new nucleotide sequence matching algorithm for clustering EST sequences. It pairs a fast linear matching algorithm, used to compare sequences, with single-linkage agglomerative clustering, used to merge matching sequences into clusters.
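To make the single-linkage step concrete, here is a minimal Python sketch. It is not CLU's implementation: the similarity test (a count of shared k-mers), the function names, and the parameters `k` and `min_shared` are all assumptions chosen for illustration, and the all-pairs comparison is quadratic rather than the fast linear matching the paper describes. It only shows how single linkage turns pairwise matches into clusters through transitive closure.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def single_linkage_clusters(seqs, k=8, min_shared=3):
    """Cluster sequences by single linkage over a shared-k-mer test.

    Hypothetical stand-in for a real matching algorithm: two sequences
    are "linked" when they share at least min_shared k-mers. Clusters
    are the connected components of the resulting link graph.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    profiles = [kmers(s, k) for s in seqs]
    for i, j in combinations(range(len(seqs)), 2):
        if len(profiles[i] & profiles[j]) >= min_shared:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri  # merge the two clusters

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy reads: 0 overlaps 1, 1 overlaps 2, and 3 is unrelated.
ests = [
    "ACGTACGGTCAGTTAGCCTA",
    "GGTCAGTTAGCCTATTGACC",
    "CCTATTGACCGGATTCAAGT",
    "TTTTGGGGCCCCAAAATTTT",
]
print(single_linkage_clusters(ests))
```

Reads 0 and 2 share no 8-mer directly, yet they land in the same cluster because each is linked to read 1. That chaining is the defining property of single linkage. It suits ESTs, which are partial reads tiling one transcript, but a single spurious match is enough to merge two unrelated clusters.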
Limitations
The current implementation does not retain cluster alignments for later analysis, and its throughput is limited by the performance of desktop PCs.