MACSE: Multiple Alignment of Coding SEquences Accounting for Frameshifts and Stop Codons
2011

MACSE: A Tool for Aligning Coding Sequences with Frameshifts and Stop Codons

Publication Evidence: high

Author Information

Author(s): Vincent Ranwez, Sébastien Harispe, Frédéric Delsuc, Emmanuel J. P. Douzery

Primary Institution: Institut des Sciences de l'Evolution, UMR5554-CNRS, Université Montpellier II, Montpellier, France

Hypothesis

Can we develop an algorithm that aligns nucleotide sequences containing open reading frames while accounting for frameshifts and stop codons?

Conclusion

The MACSE program aligns protein-coding sequences, including those with frameshifts and stop codons, while preserving their codon structure, which improves the accuracy of the resulting multiple sequence alignments.

Supporting Evidence

  • MACSE is the first automatic solution to align protein-coding gene datasets containing non-functional sequences without disrupting the underlying codon structure.
  • MACSE has been shown to detect undocumented frameshifts in public database sequences.
  • The program can align high-throughput sequencing reads against reference coding sequences effectively.
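
To make the terms above concrete, here is a minimal sketch of the two symptoms that mark a coding sequence as non-functional: a length that is not a multiple of three (a hint of a frameshift) and a stop codon before the end of the reading frame. It is not part of MACSE. The function name coding_anomalies, the use of the standard genetic code, and the assumption that the sequence starts in frame are all illustrative choices.

```python
# Stop codons of the standard genetic code (an assumption; other codes differ).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_anomalies(seq):
    """Report simple signs that a nucleotide sequence is not a clean ORF.

    Hypothetical helper: assumes the sequence starts in reading frame 0.
    """
    seq = seq.upper()
    # Split into complete codons, ignoring any trailing 1 or 2 nucleotides.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere before the last complete codon is "internal".
    internal_stops = [i * 3 for i, c in enumerate(codons[:-1]) if c in STOP_CODONS]
    return {
        "length_not_multiple_of_3": len(seq) % 3 != 0,
        "internal_stop_positions": internal_stops,
    }


if __name__ == "__main__":
    # TAA at nucleotide offset 3 interrupts the frame; the final TGA is terminal.
    print(coding_anomalies("ATGTAAGGGTGA"))
```

A check like this only flags suspicious sequences. Locating where a frameshift occurred requires an alignment against related sequences, which is the problem MACSE addresses.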

Takeaway

MACSE is a computer program that helps scientists line up DNA sequences, even when there are mistakes in the sequences, so they can study them better.

Methodology

The study presents an algorithm that extends the classical Needleman-Wunsch algorithm to accommodate sequencing errors and biological deviations, implemented in the MACSE program for multiple sequence alignment.
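
For reference, the classical Needleman-Wunsch algorithm that the study builds on can be sketched as follows. This is the textbook pairwise version with a linear gap penalty, not the MACSE extension: it has no notion of codons, frameshifts, or stop codons. The function name and the scoring values (match=1, mismatch=-1, gap=-2) are arbitrary assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("ATGCCA", "ATGCA")
    print(s)
    print(x)
    print(y)
```

In this plain form a single missing nucleotide is just one gap, so the alignment says nothing about the reading frame. The extension described in the study adds that awareness.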

Limitations

The algorithm may not handle unexpected frameshifting substitutions optimally, and its running time is longer than that of some existing methods.

Digital Object Identifier (DOI)

10.1371/journal.pone.0022594
