GBParsy: A GenBank flatfile parser library with high speed
2008

GBParsy: A Fast GenBank File Parser

Publication Evidence: high

Author Information

Author(s): Lee Tae-Ho, Kim Yeon-Ki, Nahm Baek Hie

Primary Institution: MyongJi University

Hypothesis

Can we develop a faster parser for GenBank flatfiles to improve data processing speed?

Conclusion

GBParsy can extract information from large GenBank flatfiles significantly faster than existing parsers.

Supporting Evidence

  • GBParsy is 5 to 100 times faster than current parsers in benchmark tests.
  • It can parse 100 Mb of a GenBank flatfile in under a second.
  • The library was designed to optimize both speed and memory usage.
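
To make the throughput figure concrete, here is a minimal C sketch of how parsing speed could be measured. It is not the authors' benchmark code: the function names are hypothetical, counting "//" record terminators merely stands in for a real parser, and "Mb" is assumed to mean 10^6 bytes.

```c
#include <stddef.h>
#include <time.h>

/* Count GenBank records in a buffer. Each record ends with a line that
   starts with "//". This is a stand-in workload, not a real parser. */
size_t count_records(const char *buf, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        int at_line_start = (i == 0) || (buf[i - 1] == '\n');
        if (at_line_start && buf[i] == '/' && buf[i + 1] == '/') {
            count++;
        }
    }
    return count;
}

/* Convert a byte count and elapsed time to megabytes per second
   (assumption: 1 Mb = 1,000,000 bytes). Returns 0 for a zero time. */
double throughput_mb_per_s(size_t bytes, double seconds) {
    return seconds > 0.0 ? (double)bytes / 1e6 / seconds : 0.0;
}

/* Time one scan of the buffer with the standard clock() function and
   return the elapsed CPU time in seconds. */
double time_scan(const char *buf, size_t n, size_t *records) {
    clock_t start = clock();
    *records = count_records(buf, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Under these assumptions, a parser that handles 100 Mb in under one second sustains more than 100 Mb per second, which is what `throughput_mb_per_s(100000000, t)` reports for any `t` below 1.0.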

Takeaway

GBParsy is a tool that helps computers read DNA data much faster than before, making it easier for scientists to work with large amounts of information.

Methodology

The authors developed a C-based library that uses optimized algorithms to parse GenBank flatfiles.
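
As an illustration of what parsing a GenBank flatfile in C can look like, the sketch below reads the LOCUS and DEFINITION lines of one record using plain string comparisons. It is a minimal example under stated assumptions, not GBParsy's API: the `gb_record` type and the `parse_gb_record` function are hypothetical, and only two keywords are handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for this sketch; not GBParsy's data structure. */
typedef struct {
    char name[32];         /* locus name from the LOCUS line */
    long length;           /* sequence length from the LOCUS line */
    char definition[128];  /* text of the first DEFINITION line */
} gb_record;

/* Return 1 if the line begins with the given keyword. */
static int has_keyword(const char *line, const char *kw) {
    return strncmp(line, kw, strlen(kw)) == 0;
}

/* Copy at most cap-1 characters and always null-terminate. */
static void copy_field(char *dst, size_t cap, const char *src, size_t len) {
    if (len >= cap) len = cap - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse one record from a null-terminated text buffer.
   Returns 1 when the "//" terminator is reached, 0 otherwise. */
int parse_gb_record(const char *text, gb_record *rec) {
    const char *p = text;
    memset(rec, 0, sizeof *rec);
    while (*p) {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);
        if (has_keyword(p, "LOCUS")) {
            /* Name and length are the next two whitespace-separated tokens. */
            if (sscanf(p + 5, "%31s %ld", rec->name, &rec->length) != 2) {
                return 0;
            }
        } else if (has_keyword(p, "DEFINITION")) {
            const char *v = p + 10;
            size_t vl = len - 10;
            while (vl > 0 && *v == ' ') { v++; vl--; }  /* skip padding */
            copy_field(rec->definition, sizeof rec->definition, v, vl);
        } else if (has_keyword(p, "//")) {
            return 1;  /* end of record */
        }
        if (!eol) break;
        p = eol + 1;
    }
    return 0;
}
```

Calling `parse_gb_record(text, &rec)` on a buffer that holds one complete record fills `rec.name`, `rec.length`, and `rec.definition`. A full parser would also handle continuation lines, the FEATURES table, and the ORIGIN sequence block.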

Limitations

Performance may vary with the complexity of the GenBank flatfiles being parsed.

Digital Object Identifier (DOI)

10.1186/1471-2105-9-321
