GBParsy: A GenBank flatfile parser library with high speed
2008
GBParsy: A Fast GenBank File Parser
publication
Evidence: high
Author Information
Author(s): Lee Tae-Ho, Kim Yeon-Ki, Nahm Baek Hie
Primary Institution: MyongJi University
Hypothesis
Can a parser for GenBank flatfiles (GBF) be made substantially faster than existing parsers?
Conclusion
GBParsy can extract information from large GenBank flatfiles significantly faster than existing parsers.
Supporting Evidence
- GBParsy is 5 to 100 times faster than current parsers in benchmark tests.
- It can parse a 100 MB GenBank flatfile in under a second.
- The library was designed to optimize both speed and memory usage.
Takeaway
GBParsy is a tool that helps computers read DNA data much faster than before, making it easier for scientists to work with large amounts of information.
Methodology
The authors developed a C library for parsing GenBank flatfiles, using algorithms optimized for parsing speed.
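This summary does not say which optimizations the library uses. As an illustration only, the sketch below shows one technique that fits a fixed-column format like GenBank: slicing a keyword line by column offset instead of pattern matching. It relies on the GenBank flatfile convention that top-level keywords occupy columns 1-12 and their values start in column 13. The function name `gb_split_keyword_line` and the constant `GB_KEYWORD_WIDTH` are hypothetical and are not part of GBParsy's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration; not taken from GBParsy.
 * GenBank keyword lines keep the keyword in columns 1-12 and the
 * value from column 13 on, so fixed offsets can stand in for
 * pattern matching. */
enum { GB_KEYWORD_WIDTH = 12 };

/* Copies the keyword (trailing spaces trimmed) into `keyword` and
 * returns a pointer to the value inside `line`.
 * Returns NULL if the line starts with a space (continuation or
 * sub-keyword line), has no value field, or `keyword` is too small. */
static const char *gb_split_keyword_line(const char *line,
                                         char *keyword,
                                         size_t keyword_size)
{
    assert(line != NULL && keyword != NULL);

    size_t len = strlen(line);
    if (len <= GB_KEYWORD_WIDTH || line[0] == ' ' ||
        keyword_size <= GB_KEYWORD_WIDTH) {
        return NULL;
    }

    /* Trim the space padding at the end of the keyword field. */
    size_t kw_len = GB_KEYWORD_WIDTH;
    while (kw_len > 0 && line[kw_len - 1] == ' ') {
        kw_len--;
    }

    memcpy(keyword, line, kw_len);
    keyword[kw_len] = '\0';

    /* The value begins right after the fixed-width keyword field. */
    return line + GB_KEYWORD_WIDTH;
}
```

Called on the line `DEFINITION  Homo sapiens chromosome 1.` with a 16-byte buffer, the function fills the buffer with `DEFINITION` and returns a pointer to `Homo sapiens chromosome 1.` without copying the value. Each line costs one length scan plus a copy of at most 12 bytes, which is the kind of saving a speed-focused parser might pursue.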
Limitations
Performance may vary with the complexity of the GenBank flatfiles being parsed.