Errors in Mycobacterium smegmatis Genome Sequencing
Author Information
Author(s): Caroline Deshayes, Emmanuel Perrodou, Sebastien Gallien, Daniel Euphrasie, Christine Schaeffer, Alain Van-Dorsselaer, Olivier Poch, Odile Lecompte, Jean-Marc Reyrat
Primary Institution: Université Paris Descartes
Hypothesis
Should bacterial interrupted coding sequences (ICDS) be individually verified to produce an informative genome sequence?
Conclusion
A significant proportion of interrupted coding sequences in Mycobacterium smegmatis are due to sequencing errors, which can affect protein predictions and annotations.
Supporting Evidence
- 28 of the 73 investigated interrupted coding sequences (about 38%) were found to be sequencing errors.
- The errors led to significant changes in predicted protein sequences.
- The study suggests that each bacterial ICDS should be investigated individually.
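
To illustrate why a single sequencing error can change a predicted protein so markedly, here is a minimal Python sketch. It is not the study's pipeline: the two DNA strings are invented toy sequences, and the names `translate`, `corrected` and `published` are hypothetical. It assumes the standard genetic code and shows one dropped base in a run of T's creating a premature stop codon, which is how a coding sequence can appear interrupted.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy sequences invented for illustration; they are not taken from the study.
corrected = "ATGGCCAAAGGTTTGATCGCTTAA"            # stands in for a resequenced gene
published = corrected[:12] + corrected[13:]      # same gene with one T dropped

print(translate(corrected))  # full-length toy protein
print(translate(published))  # truncated: the frameshift exposes a TGA stop codon
```

In this toy case the intact sequence translates to seven residues, while the version with the missing base stops after four. An annotation pipeline reading the erroneous sequence would therefore see an interrupted gene, not a full-length one.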
Takeaway
Scientists found that many apparent gene-breaking mutations in the published DNA sequence of a bacterium were errors introduced when the sequence was read, not real changes in the bacterium itself.
Methodology
The study involved resequencing the genome and using mass spectrometry to analyze interrupted coding sequences.
Limitations
Without individual investigation, the study could not predict whether a given ICDS reflects an authentic mutation or a sequencing error.