Fast Algorithms for Computing Sequence Distances
Author Information
Author(s): Alberto Apostolico, Olgert Denas
Hypothesis
Can we develop efficient algorithms for computing sequence distances based on subword composition?
Conclusion
The study presents fast tools for computing distances based on subword composition, substantially speeding up calculations that were previously time-consuming.
Supporting Evidence
- The algorithms compute distances in time linear in the length of the input sequences, making them suitable for large genomic datasets.
- The study demonstrates the effectiveness of using subword compositions for phylogenetic analysis.
Takeaway
This study created a new way to compare DNA sequences quickly, which helps scientists understand how different species are related.
Methodology
The method compares the frequencies of all subwords in the two input sequences, using a suffix tree to compute the distance efficiently.
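To make the idea of a composition-based distance concrete, here is a minimal Python sketch. It is not the authors' algorithm. It enumerates subwords naively with a dictionary instead of a suffix tree, caps the subword length at a hypothetical parameter max_len, and uses a plain Euclidean distance between relative-frequency vectors as an assumed stand-in for the distance used in the study. The function names are illustrative only.

```python
from collections import Counter
from math import sqrt


def subword_frequencies(seq, max_len):
    """Relative frequency of every subword of length 1..max_len in seq.

    Naive enumeration: about len(seq) * max_len substrings are generated.
    This is NOT the linear-time suffix tree approach; it only illustrates
    what is being counted.
    """
    freqs = {}
    for k in range(1, max_len + 1):
        n_windows = len(seq) - k + 1
        if n_windows <= 0:
            break  # seq is shorter than k, so no longer subwords exist
        counts = Counter(seq[i:i + k] for i in range(n_windows))
        for word, count in counts.items():
            # Normalize by the number of windows of this length.
            freqs[word] = count / n_windows
    return freqs


def composition_distance(a, b, max_len=3):
    """Euclidean distance between the subword frequency profiles of a and b.

    The choice of Euclidean distance and of max_len is an assumption made
    for illustration, not taken from the study.
    """
    fa = subword_frequencies(a, max_len)
    fb = subword_frequencies(b, max_len)
    words = set(fa) | set(fb)
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))


if __name__ == "__main__":
    print(composition_distance("ACGTACGT", "ACGTTGCA"))
```

Identical sequences get distance 0, and sequences that share no subwords get the largest values. The cost of this sketch grows with both the sequence length and max_len, which is the kind of overhead a suffix tree is meant to avoid.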
Limitations
The study does not address the performance of the algorithms on very large datasets or the impact of varying parameters on results.