Linkage-based ortholog refinement in bacterial pangenomes with CLARC
2024

Improving Bacterial Pangenome Analysis with CLARC

Sample size: 8,898 | Publication | 10 minutes | Evidence: high

Author Information

Author(s): González Ojeda Indra, Palace Samantha G., Martinez Pamela P., Azarian Taj, Grant Lindsay R., Hammitt Laura L., Hanage William P., Lipsitch Marc

Primary Institution: Harvard University

Hypothesis

Can the CLARC tool improve the accuracy of pangenome analyses by refining the definitions of core and accessory genes?

Conclusion

The CLARC tool substantially reduces accessory gene estimates and improves the accuracy of core gene determination in bacterial pangenome analyses.

Supporting Evidence

  • CLARC reduced accessory gene estimates by more than 30%.
  • Using CLARC improved the prediction of post-vaccine population structure in Streptococcus pneumoniae.
  • 36% of essential genes were misclassified as accessory genes in the original pangenome analysis.

Takeaway

Scientists created a tool called CLARC to help better understand the genes in bacteria. It helps to group similar genes together, making it easier to study how bacteria evolve.

Methodology

The study used a custom clustering algorithm in the CLARC tool to refine clusters of orthologous genes (COGs) based on sequence identity, functional annotation, and linkage information.
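The idea of combining sequence identity, functional annotation, and linkage can be illustrated with a toy example. The sketch below is not CLARC's actual implementation. It assumes that "linkage" means two COGs never appear in the same genome, and it uses a simple greedy merge. The function names, the 0.95 identity cutoff, the 0.95 core-frequency cutoff, and the gene labels are all hypothetical.

```python
def pair_identity(identity, a, b):
    """Look up pairwise sequence identity; unknown pairs count as 0.0."""
    return identity.get(frozenset((a, b)), 0.0)


def refine_cogs(presence, identity, annotation, min_identity=0.95):
    """Greedily merge COGs that never co-occur, are similar in sequence,
    and share a functional annotation (all criteria are assumptions).

    presence:   dict COG name -> set of genome IDs carrying that COG
    identity:   dict frozenset({cog_a, cog_b}) -> identity in [0, 1]
    annotation: dict COG name -> annotation string
    """
    clusters = []
    for cog in sorted(presence):
        for cluster in clusters:
            similar = all(
                pair_identity(identity, cog, member) >= min_identity
                and annotation[cog] == annotation[member]
                for member in cluster["members"]
            )
            # Linkage criterion: no genome may carry both the cluster and the COG.
            if similar and cluster["genomes"].isdisjoint(presence[cog]):
                cluster["members"].append(cog)
                cluster["genomes"] |= presence[cog]
                break
        else:
            # No compatible cluster found, so the COG starts its own.
            clusters.append({"members": [cog], "genomes": set(presence[cog])})
    return clusters


def classify(genomes, n_genomes, core_threshold=0.95):
    """Label a cluster as core or accessory by its frequency (hypothetical cutoff)."""
    return "core" if len(genomes) / n_genomes >= core_threshold else "accessory"


# Toy data: one gene split into two COGs, plus one unrelated rare gene.
presence = {
    "geneX_a": {"g1", "g2"},
    "geneX_b": {"g3", "g4"},
    "geneY": {"g2"},
}
identity = {frozenset(("geneX_a", "geneX_b")): 0.97}
annotation = {"geneX_a": "function A", "geneX_b": "function A", "geneY": "function B"}

clusters = refine_cogs(presence, identity, annotation)
for cluster in clusters:
    print(cluster["members"], classify(cluster["genomes"], n_genomes=4))
```

In this toy data, the two geneX COGs each appear in half of the genomes and would separately count as accessory. Once merged, the combined cluster covers all four genomes and is classified as core, which is the kind of correction the summary describes.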

Limitations

CLARC may not resolve misclassifications for multicopy genes and may increase computation time if low-frequency genes are included.
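One plausible reason for the computation-time limitation is that comparing COGs pairwise scales quadratically with the number of COGs considered. The sketch below illustrates that reasoning under this assumption; it is not taken from CLARC itself, and the function names and the 5% and 95% frequency cutoffs are hypothetical.

```python
from math import comb


def filter_by_frequency(presence, n_genomes, min_freq=0.05, max_freq=0.95):
    """Keep only COGs whose frequency lies within [min_freq, max_freq]."""
    return {
        cog: genomes
        for cog, genomes in presence.items()
        if min_freq <= len(genomes) / n_genomes <= max_freq
    }


def candidate_pairs(n_cogs):
    """Number of unordered COG pairs to compare: n * (n - 1) / 2."""
    return comb(n_cogs, 2)


# Toy data over 100 genomes: one common COG and one rare COG.
toy_presence = {
    "common_cog": set(range(50)),  # frequency 0.50, kept
    "rare_cog": {0},               # frequency 0.01, filtered out
}
kept = filter_by_frequency(toy_presence, n_genomes=100)
print(sorted(kept))

# Pair counts grow quadratically as more COGs pass the filter.
print(candidate_pairs(1000), candidate_pairs(5000))
```

Lowering the minimum frequency lets more rare COGs through the filter, and a fivefold increase in COGs yields roughly a twenty-five-fold increase in pairs to evaluate.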

Participant Demographics

The study analyzed 8,898 genomes from various geographic locations, including samples from children and infants.

Digital Object Identifier (DOI)

10.1101/2024.12.18.629228
