SNPFile: A New File Format for Genotype Data
Author Information
Author(s): Nielsen Jesper, Mailund Thomas
Primary Institution: Bioinformatics Research Center, University of Aarhus, Denmark
Hypothesis
A binary file format, SNPFile, can make storing and manipulating single nucleotide polymorphism (SNP) genotype data more efficient.
Conclusion
In the authors' own studies, the SNPFile format significantly reduced the informatics burden of managing secondary data, and its memory and IO efficiency made analysis runs more efficient.
Supporting Evidence
- The SNPFile format allows for efficient storage of both primary and secondary data in a single file.
- It is designed to be IO efficient for multi-locus analysis methods.
- The format has been successfully used in the authors' own studies.
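To make the IO-efficiency point concrete, here is a minimal Python sketch of reading a window of neighbouring loci without loading the whole file. It is an illustration under stated assumptions, not the SNPFile implementation: the locus-major layout with one byte per genotype, the use of memory mapping, and the function name read_locus_window are all hypothetical.

```python
import mmap


def read_locus_window(path, n_individuals, first_locus, n_loci_in_window):
    """Return the genotypes of consecutive loci as a list of lists of ints.

    Assumed (hypothetical) layout: locus-major, one byte per genotype,
    so each locus occupies n_individuals contiguous bytes.
    """
    with open(path, "rb") as f:
        # Map the file read-only; only the pages we slice are actually read.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = first_locus * n_individuals
            stop = start + n_loci_in_window * n_individuals
            if first_locus < 0 or n_loci_in_window < 0 or stop > len(mm):
                raise IndexError("locus window falls outside the file")
            block = mm[start:stop]  # copies just this window into a bytes object
    # Split the contiguous block back into one list per locus.
    return [
        list(block[i * n_individuals:(i + 1) * n_individuals])
        for i in range(n_loci_in_window)
    ]
```

Because the loci in a window sit next to each other on disk in this assumed layout, a multi-locus method that scans windows along the genome touches one contiguous byte range per window.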
Takeaway
SNPFile is a new way to store genetic data that makes it easier and faster to work with large amounts of information.
Methodology
The SNPFile format uses a binary representation to store genotype data and allows for flexible serialization of additional data.
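The idea of keeping binary genotype data and additional serialized data in one file can be sketched as follows. This is a toy layout invented for illustration, not the actual SNPFile format: the magic bytes, the 2-bit genotype coding (0, 1, 2, and 3 for missing), the header fields, the JSON encoding of the secondary data, and all function names are assumptions.

```python
import json
import struct

MAGIC = b"GTSK"  # hypothetical signature, not the real SNPFile one


def pack_genotypes(genotypes):
    """Pack genotype codes (0, 1, 2, or 3 for missing) at 2 bits each."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if not 0 <= g <= 3:
            raise ValueError(f"genotype code out of range: {g}")
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, count):
    """Inverse of pack_genotypes for the first `count` genotypes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]


def write_file(path, matrix, secondary):
    """Write a genotype matrix (one row per individual) plus secondary data."""
    n_ind = len(matrix)
    n_loci = len(matrix[0]) if matrix else 0
    meta = json.dumps(secondary).encode("utf-8")
    with open(path, "wb") as f:
        f.write(MAGIC)
        # Header: number of individuals, number of loci, secondary-data length.
        f.write(struct.pack("<III", n_ind, n_loci, len(meta)))
        for row in matrix:
            if len(row) != n_loci:
                raise ValueError("all rows must have the same number of loci")
            f.write(pack_genotypes(row))
        f.write(meta)  # secondary data lives in the same file as the genotypes


def read_file(path):
    """Read back the matrix and the secondary data written by write_file."""
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            raise ValueError("not a file in the sketched layout")
        n_ind, n_loci, meta_len = struct.unpack("<III", f.read(12))
        row_bytes = (n_loci + 3) // 4
        matrix = [unpack_genotypes(f.read(row_bytes), n_loci) for _ in range(n_ind)]
        secondary = json.loads(f.read(meta_len).decode("utf-8"))
    return matrix, secondary
```

In this sketch the secondary data (for example phenotypes or analysis results) travels with the genotypes, so there is no separate side file to keep in sync.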
Limitations
The file format is only supported by a limited set of analysis tools developed in the authors' lab.