Genome Assembly Has a Major Impact on Gene Content: A Comparison of Annotation in Two Bos Taurus Assemblies
2011

Impact of Genome Assembly Quality on Gene Annotation in Cattle

Publication evidence: high

Author Information

Author(s): Florea Liliana, Souvorov Alexander, Kalbfleisch Theodore S., Salzberg Steven L.

Primary Institution: University of Maryland

Hypothesis

How does the quality of an assembled genome sequence affect the gene and SNP annotations derived from it?

Conclusion

The study found that genome assembly quality significantly impacts gene and SNP annotation, with many genes varying between assemblies due to mis-assembly events and local sequence variations.

Supporting Evidence

  • 40% of the genes varied significantly between assemblies.
  • 660 protein coding genes in the earlier assembly are missing from the later genome's annotation.
  • 15% of the genes have complex structural differences between the two assemblies.

Takeaway

When scientists piece together the genetic code of cows, the quality of that assembly matters: a poorly assembled genome can change which genes are found and how their structure and function are understood.

Methodology

The study compared gene and SNP annotations for two different Bos taurus genome assemblies using the same annotation software.
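
As a rough illustration of this kind of comparison, the sketch below classifies each gene in an earlier annotation as unchanged, changed, or missing in a later one. It is not the authors' pipeline: the input format (gene ID mapped to a list of exon lengths), the function names, and the toy gene IDs are all assumptions made for the example, and a real comparison would need to align genes between assemblies rather than match them by ID.

```python
from collections import Counter


def compare_annotations(earlier, later):
    """Classify each gene in the earlier annotation against the later one.

    Both arguments map a gene ID to a list of exon lengths (a hypothetical
    input format). Genomic coordinates are deliberately not compared,
    because they differ between assemblies.
    """
    status = {}
    for gene_id, exons in earlier.items():
        if gene_id not in later:
            status[gene_id] = "missing"   # absent from the later annotation
        elif later[gene_id] == exons:
            status[gene_id] = "same"      # identical exon structure
        else:
            status[gene_id] = "changed"   # structure differs between assemblies
    return status


def summarize(status):
    """Return the fraction of genes in each category."""
    total = len(status)
    if total == 0:
        return {}
    counts = Counter(status.values())
    return {key: counts[key] / total for key in ("same", "changed", "missing")}


# Toy data with made-up gene IDs and exon lengths (not from the study).
earlier = {
    "geneA": [120, 85, 300],
    "geneB": [200, 150],
    "geneC": [90],
    "geneD": [400, 60],
    "geneE": [75, 75],
}
later = {
    "geneA": [120, 85, 300],
    "geneB": [200, 140],
    "geneD": [400, 60],
    "geneE": [75, 75],
}

status = compare_annotations(earlier, later)
fractions = summarize(status)
```

Holding the annotation software fixed, as the study did, means that differences reported by a comparison like this can be attributed to the assemblies rather than to the annotation method.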

Limitations

The study is limited to two assemblies and may not represent all genome assembly scenarios.

Digital Object Identifier (DOI)

10.1371/journal.pone.0021400
