Efficient counting of k-mers in DNA sequences using a bloom filter
2011

Counting DNA Sequences Efficiently with Bloom Filters

publication Evidence: moderate

Author Information

Author(s): Páll Melsted, Jonathan K. Pritchard

Primary Institution: The University of Chicago

Hypothesis

Can a Bloom filter be used to efficiently count k-mers in DNA sequences?

Conclusion

The proposed method significantly reduces memory usage while counting k-mers in DNA sequences.

Supporting Evidence

  • The method achieves up to 50% savings in memory usage compared to current software, with modest costs in computational speed.
  • BFCounter is implemented in C++ and is available for free download.
  • The study demonstrates the effectiveness of Bloom filters in bioinformatics applications.

Takeaway

This study shows a way to count pieces of DNA more efficiently, saving computer memory by using a special technique called a Bloom filter.

Methodology

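The study uses a Bloom filter, a probabilistic data structure that stores observed k-mers implicitly in memory, to identify the k-mers that occur more than once in DNA sequences. A second sweep through the data then provides exact counts for those non-unique k-mers.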
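
The approach can be illustrated with a minimal Python sketch. This is not BFCounter's actual C++ code. It assumes a simple structure: a first pass that uses a Bloom filter to flag k-mers that appear to have been seen before, and a second pass that counts those flagged k-mers exactly. The class and function names (BloomFilter, kmers, count_repeated_kmers), the SHA-256 double-hashing scheme, and the default filter size are all illustrative assumptions, and the sketch ignores reverse complements.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter over strings (illustrative, not BFCounter's implementation)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Assumption: derive all bit positions from one SHA-256 digest
        # using double hashing (h1 + i * h2).
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_repeated_kmers(reads, k, num_bits=1 << 20, num_hashes=4):
    """Return exact counts for k-mers occurring more than once in reads.

    reads must be re-iterable (for example a list), because it is scanned twice.
    """
    bloom = BloomFilter(num_bits, num_hashes)
    candidates = {}

    # Pass 1: a k-mer the filter claims to have seen becomes a candidate.
    # False positives can add unique k-mers here, but no repeated k-mer is missed.
    for read in reads:
        for kmer in kmers(read, k):
            if kmer in bloom:
                candidates[kmer] = 0
            else:
                bloom.add(kmer)

    # Pass 2: count the candidates exactly.
    for read in reads:
        for kmer in kmers(read, k):
            if kmer in candidates:
                candidates[kmer] += 1

    # Candidates with a count of 1 were Bloom filter false positives.
    return {kmer: count for kmer, count in candidates.items() if count > 1}
```

For example, count_repeated_kmers(["ACGTACGT", "TTACGA"], 4) keeps only the 4-mers ACGT and TACG, each with a count of 2. The memory saving in this sketch comes from never storing the unique k-mers in the dictionary, apart from the few admitted by false positives.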

Limitations

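Bloom filters are probabilistic and can return false positives, so some k-mers that occur only once may be flagged as repeated during the first pass. Exact counts therefore require a second pass over the data.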
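
To give a sense of how often such false positives occur, the sketch below evaluates the standard textbook approximation for a Bloom filter's false positive rate, p ≈ (1 - e^(-hn/m))^h, where m is the number of bits, h the number of hash functions, and n the number of inserted items. The formula assumes independent, uniformly distributed hash functions. It is general Bloom filter theory rather than a result quoted from this study, and the function names are illustrative.

```python
import math


def bloom_false_positive_rate(num_bits, num_hashes, num_items):
    """Approximate false positive rate after inserting num_items elements."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


def optimal_num_hashes(num_bits, num_items):
    """Number of hash functions that minimizes the rate: (m / n) * ln 2, rounded."""
    return max(1, round(num_bits / num_items * math.log(2)))
```

With 8 bits per stored k-mer, for instance, optimal_num_hashes(8000, 1000) gives 6 hash functions, and bloom_false_positive_rate(8000, 6, 1000) comes out at roughly 2%.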

Digital Object Identifier (DOI)

10.1186/1471-2105-12-333
