SNPFile – A software library and file format for large scale association mapping and population genetics studies
2008

SNPFile: A New File Format for Genotype Data

Publication evidence: moderate

Author Information

Author(s): Jesper Nielsen, Thomas Mailund

Primary Institution: Bioinformatics Research Center, University of Aarhus, Denmark

Hypothesis

The new binary file format SNPFile will improve the efficiency of storing and manipulating SNP genotype data.

Conclusion

In the authors' own studies, using the SNPFile format significantly reduced the informatics burden of managing secondary data and improved memory and IO efficiency in analysis runs.

Supporting Evidence

  • The SNPFile format allows for efficient storage of both primary and secondary data in a single file.
  • It is designed to be IO efficient for multi-locus analysis methods.
  • The format has been successfully used in the authors' own studies.
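
To make the IO-efficiency point concrete, here is a minimal sketch of one way a contiguous on-disk layout can help multi-locus methods. It is not the SNPFile layout or API. It assumes a hypothetical raw file with one byte per genotype, stored locus by locus, so that a window of adjacent loci can be fetched with a single seek and read. The function names and the genotype coding (0, 1, 2) are illustrative assumptions.

```python
import os
import tempfile


def write_locus_major(path, genotypes_by_locus):
    """Write one byte per genotype, locus by locus (hypothetical layout)."""
    n_ind = len(genotypes_by_locus[0]) if genotypes_by_locus else 0
    with open(path, "wb") as f:
        for locus in genotypes_by_locus:
            if len(locus) != n_ind:
                raise ValueError("every locus needs one genotype per individual")
            f.write(bytes(locus))
    return n_ind


def read_window(path, n_ind, start, stop):
    """Read loci start..stop-1 with one seek and one contiguous read."""
    with open(path, "rb") as f:
        f.seek(start * n_ind)
        data = f.read((stop - start) * n_ind)
    # Split the contiguous byte range back into one list per locus.
    return [list(data[i:i + n_ind]) for i in range(0, len(data), n_ind)]


def demo():
    """Write four loci for three individuals, then read back loci 1 and 2."""
    loci = [[0, 1, 2], [2, 2, 0], [1, 0, 0], [0, 0, 1]]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        n_ind = write_locus_major(path, loci)
        return read_window(path, n_ind, 1, 3)
    finally:
        os.remove(path)
```

Because neighbouring loci sit next to each other in the file, a sliding-window analysis only touches the bytes it needs rather than parsing the whole dataset.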

Takeaway

SNPFile is a binary file format, with an accompanying software library, for storing SNP genotype data; it makes large datasets easier and faster to work with.

Methodology

The SNPFile format uses a binary representation to store genotype data and allows for flexible serialization of additional data.
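
The idea of keeping binary genotype data and serialized additional data in one file can be sketched as follows. This is an illustration under stated assumptions, not the actual SNPFile byte layout: the magic bytes, the header fields, the one-byte-per-genotype coding (0, 1, 2, with 255 for missing), and the use of JSON for the secondary data are all hypothetical choices made for the example.

```python
import json
import struct

MAGIC = b"GENO"          # hypothetical signature, not the SNPFile one
HEADER = "<4sIII"        # magic, individuals, loci, byte length of secondary data


def pack(genotypes, secondary):
    """Pack a genotype matrix (rows = individuals) and a dict of secondary data."""
    n_ind = len(genotypes)
    n_loci = len(genotypes[0]) if genotypes else 0
    if any(len(row) != n_loci for row in genotypes):
        raise ValueError("all rows must have the same number of loci")
    body = bytes(g for row in genotypes for g in row)  # one byte per genotype
    extra = json.dumps(secondary, sort_keys=True).encode("utf-8")
    header = struct.pack(HEADER, MAGIC, n_ind, n_loci, len(extra))
    return header + body + extra


def unpack(blob):
    """Inverse of pack: return (genotype rows, secondary data dict)."""
    magic, n_ind, n_loci, extra_len = struct.unpack_from(HEADER, blob, 0)
    if magic != MAGIC:
        raise ValueError("not a file produced by pack()")
    offset = struct.calcsize(HEADER)
    rows = [
        list(blob[offset + i * n_loci: offset + (i + 1) * n_loci])
        for i in range(n_ind)
    ]
    offset += n_ind * n_loci
    secondary = json.loads(blob[offset:offset + extra_len].decode("utf-8"))
    return rows, secondary
```

Keeping the marker names, phenotypes, or other derived data in the same file as the genotypes means they cannot drift out of sync with the matrix they describe, which is the kind of bookkeeping the summary refers to as secondary data.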

Limitations

The file format is only supported by a limited set of analysis tools developed in the authors' lab.

Digital Object Identifier (DOI)

10.1186/1471-2105-9-526
