Gk Arrays: A New Way to Index Large Read Collections

Publication Evidence: High

Author Information

Author(s): Nicolas Philippe, Mikaël Salson, Thierry Lecroq, Martine Léonard, Thérèse Commes, Eric Rivals

Primary Institution: LIRMM, UMR 5506, CNRS and Université de Montpellier

Hypothesis

The study proposes Gk arrays, a new data structure that indexes large collections of sequencing reads so that k-mer queries can be answered efficiently during bioinformatics analysis.

Conclusion

Gk arrays provide a versatile and efficient method for read analysis, using less memory and answering queries faster than the existing methods they were compared with.

Supporting Evidence

Gk arrays can index larger read collections while using less memory than the compared methods.
The structure supports fast k-mer queries across a range of read analysis contexts.
Gk arrays are available as a C++ library under a GPL-compliant license.

Takeaway

This study introduces a new tool that helps scientists quickly find information in large sets of DNA sequences, making it easier to study genes and other biological data.

Methodology

The study developed Gk arrays, a data structure that indexes reads in main memory and allows for efficient querying of k-mers.
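
To make the idea of an in-memory read index with k-mer queries concrete, here is a minimal sketch. It is a naive hash-map substitute written for illustration, not the Gk arrays structure itself and not the API of the published library; the class name `NaiveKmerIndex` and its methods `occurrences` and `readCount` are hypothetical. It only shows the kind of question such an index answers: in which reads, and at which offsets, does a given k-mer occur?

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical illustration only: a naive hash-based k-mer index over reads.
// This is NOT the Gk arrays layout and does not mirror the library's API.
class NaiveKmerIndex {
public:
    // An occurrence is (read number, offset of the k-mer inside that read).
    using Occurrence = std::pair<std::size_t, std::size_t>;

    NaiveKmerIndex(const std::vector<std::string>& reads, std::size_t k) : k_(k) {
        assert(k > 0);
        for (std::size_t r = 0; r < reads.size(); ++r) {
            const std::string& read = reads[r];
            // Reads shorter than k contain no k-mer; the loop body never runs.
            for (std::size_t pos = 0; pos + k <= read.size(); ++pos) {
                index_[read.substr(pos, k)].emplace_back(r, pos);
            }
        }
    }

    // All (read, offset) occurrences of the k-mer; empty if it never occurs.
    std::vector<Occurrence> occurrences(const std::string& kmer) const {
        auto it = index_.find(kmer);
        if (it == index_.end()) {
            return {};
        }
        return it->second;
    }

    // Number of distinct reads that contain the k-mer at least once.
    // Occurrences are stored in increasing read order, so counting the
    // changes of read number is enough.
    std::size_t readCount(const std::string& kmer) const {
        auto it = index_.find(kmer);
        if (it == index_.end()) {
            return 0;
        }
        std::size_t count = 0;
        bool first = true;
        std::size_t previous = 0;
        for (const Occurrence& occ : it->second) {
            if (first || occ.first != previous) {
                ++count;
                previous = occ.first;
                first = false;
            }
        }
        return count;
    }

    std::size_t k() const { return k_; }

private:
    std::size_t k_;
    std::unordered_map<std::string, std::vector<Occurrence>> index_;
};
```

Building this index over the reads "ACGTAC", "GTACGT" and "TTTT" with k = 3 and asking for "ACG" returns two occurrences, one in each of the first two reads. Storing every k-mer as a string key is memory-hungry, which is exactly the cost that a compact structure such as Gk arrays is designed to avoid.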

Limitations

The study does not address read mapping; it focuses solely on read indexing.

Digital Object Identifier (DOI)

10.1186/1471-2105-12-242
