Efficient Computation of Absent Words in Genomic Sequences
Author Information
Author(s): Julia Herold, Stefan Kurtz, Robert Giegerich
Primary Institution: Center of Biotechnology, Bielefeld University
Hypothesis
Can we develop a more efficient algorithm for computing absent words in genomic sequences?
Conclusion
The new algorithm computes absent words for the human genome in 10 minutes on standard hardware, using only 2.5 Mb of space.
Supporting Evidence
- The algorithm efficiently computes unwords (the shortest absent words) of the human and mouse genomes.
- The computation requires only 2.5 Mb of space.
- The program processes genome-scale datasets quickly: the human genome takes about 10 minutes on standard hardware.
Takeaway
This study created a new computer program that quickly finds words missing from DNA sequences, which can help scientists understand genomes better.
Methodology
The study developed a new algorithm and software for computing absent words without the need for large index structures.
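To make the idea of index-free absent-word computation concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes a DNA alphabet of A, C, G, T, encodes each length-k word as an integer with two bits per base, and marks the words that occur in a flat table instead of building an index over the sequence. The names `absent_words`, `unwords`, `decode`, and the `max_k` parameter are hypothetical and chosen only for illustration.

```python
ALPHABET = "ACGT"
CODE = {base: i for i, base in enumerate(ALPHABET)}


def decode(code, k):
    """Turn a 2-bit-per-base integer back into a length-k DNA word."""
    chars = []
    for _ in range(k):
        chars.append(ALPHABET[code & 3])
        code >>= 2
    return "".join(reversed(chars))


def absent_words(sequence, k):
    """Return all length-k words over ACGT that do not occur in sequence."""
    seen = bytearray(4 ** k)   # one flag per possible length-k word
    mask = 4 ** k - 1          # keeps only the last k bases of the rolling code
    code = 0
    valid = 0                  # consecutive A/C/G/T bases ending at this position
    for base in sequence.upper():
        if base not in CODE:   # e.g. 'N': no word may span this position
            code = 0
            valid = 0
            continue
        code = ((code << 2) | CODE[base]) & mask
        valid += 1
        if valid >= k:
            seen[code] = 1
    return [decode(c, k) for c in range(4 ** k) if not seen[c]]


def unwords(sequence, max_k=12):
    """Return (k, words) for the smallest k that has absent words.

    max_k is an assumed cap that bounds the size of the flag table.
    """
    for k in range(1, max_k + 1):
        missing = absent_words(sequence, k)
        if missing:
            return k, missing
    return None, []
```

For example, `unwords("ACGT")` finds no absent words of length 1 and then reports the 13 length-2 words other than AC, CG, and GT. The table grows as 4^k entries, so this sketch is practical only for short word lengths. It also stores one byte per word for simplicity, where a packed bit vector would use an eighth of the space.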
Limitations
The software may not scale to datasets larger than the genomes tested.