Fast algorithms for computing sequence distances by exhaustive substring composition
2008

Type: publication. Evidence: high

Author Information

Author(s): Alberto Apostolico, Olgert Denas

Hypothesis

Can we develop efficient algorithms for computing sequence distances based on subword composition?

Conclusion

The study presents fast and efficient tools for distance computations based on subword compositions, significantly speeding up previously time-consuming calculations.

Supporting Evidence

  • The algorithms compute distances in time linear in the length of the input sequences, making them suitable for large genomic datasets.
  • The study demonstrates the effectiveness of using subword compositions for phylogenetic analysis.

Takeaway

This study created a new way to compare DNA sequences quickly, which helps scientists understand how different species are related.

Methodology

The method involves comparing the frequencies of all subwords in two input sequences using a suffix tree structure to compute distances efficiently.
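
To make the idea of exhaustive subword composition concrete, the following is a minimal sketch, not the authors' algorithm. It enumerates substrings by brute force (quadratic or worse) instead of using a suffix tree, so it does not achieve the linear-time bound. The function names `subword_counts` and `composition_distance`, the optional `max_len` cap, and the choice of squared Euclidean distance between raw counts are all assumptions made for illustration.

```python
from collections import Counter


def subword_counts(seq, max_len=None):
    """Count every substring of seq, optionally capped at max_len characters.

    Brute-force enumeration for illustration only; a suffix tree would
    represent the same counts far more compactly.
    """
    n = len(seq)
    limit = n if max_len is None else min(max_len, n)
    counts = Counter()
    for i in range(n):
        # Substrings starting at i with lengths 1..limit that fit in seq.
        for k in range(1, min(limit, n - i) + 1):
            counts[seq[i:i + k]] += 1
    return counts


def composition_distance(x, y, max_len=None):
    """Squared Euclidean distance between the subword count vectors of x and y.

    The distance function is an assumed, simple choice; the summary does not
    specify which measure the paper uses.
    """
    cx = subword_counts(x, max_len)
    cy = subword_counts(y, max_len)
    # Counter returns 0 for subwords absent from one of the sequences.
    return sum((cx[w] - cy[w]) ** 2 for w in cx.keys() | cy.keys())


if __name__ == "__main__":
    print(composition_distance("ACGTACGT", "ACGTTGCA"))
    print(composition_distance("ACGTACGT", "ACGTACGT", max_len=3))
```

Identical sequences give a distance of zero, and sequences that share many subwords with similar frequencies give small values. The brute-force version is only practical for short strings, which is the bottleneck a suffix-tree approach is meant to remove.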

Limitations

The study does not address the performance of the algorithms on very large datasets or the impact of varying parameters on results.

Digital Object Identifier (DOI)

10.1186/1748-7188-3-13
