Automated Lung Cancer Staging Using Language Models
Author Information
Author(s): Castonguay Alexandre, Lampridis Savvas, Kim Sanghwan, Jang Sowon, Kim Borham, Sunwoo Leonard, Kim Seok, Chung Jin-Haeng, Nam Sejin, Cho Hyeongmin, Lee Donghyoung, Lee Keehyuck, Yoo Sooyoung
Primary Institution: Seoul National University Bundang Hospital
Hypothesis
Can fine-tuned large language models accurately classify pathologic TN stages and generate rationales from lung cancer surgical pathology reports?
Conclusion
The study demonstrates that fine-tuned generative language models can automate pathologic TN classification from lung cancer surgical pathology reports, with the best-performing model reaching an EMR of 0.934.
Supporting Evidence
- The Orca2_13b model achieved the highest performance with an EMR of 0.934 and an SMR of 0.864.
- The Orca2_7b model also demonstrated strong performance with an EMR of 0.914.
- Performance was assessed using exact match ratio (EMR) and semantic match ratio (SMR) as evaluation metrics.
Takeaway
This study shows that computers can help doctors quickly and accurately figure out how advanced lung cancer is by reading pathology reports.
Methodology
The study evaluated six open-source large language models on 3,216 deidentified lung cancer surgical pathology reports, using exact match ratio (EMR) and semantic match ratio (SMR) as evaluation metrics.
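To make the EMR metric concrete, here is a minimal Python sketch. This summary does not give the study's exact scoring procedure, so the sketch assumes EMR is the share of reports whose predicted T and N stages both equal the reference labels after trimming whitespace and ignoring case. The names `Stage`, `normalize`, and `exact_match_ratio` are hypothetical and are not taken from the study. SMR is not sketched, because this summary does not say how semantic matches were judged.

```python
from typing import Sequence, Tuple

# Hypothetical representation: one (T stage, N stage) pair per report,
# for example ("T1b", "N0").
Stage = Tuple[str, str]


def normalize(stage: Stage) -> Stage:
    """Trim whitespace and lowercase both labels (an assumed normalization)."""
    t_stage, n_stage = stage
    return t_stage.strip().lower(), n_stage.strip().lower()


def exact_match_ratio(predicted: Sequence[Stage], reference: Sequence[Stage]) -> float:
    """Fraction of reports whose predicted T and N stages both match the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one report is required")
    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predicted, reference)
    )
    return matches / len(reference)


if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the study.
    predicted = [("T1b", "N0"), ("T2a", "N1"), ("T3", "N0"), ("T1a", "N2")]
    reference = [("T1b", "N0"), ("T2a", "N0"), ("T3", "N0"), ("t1a ", "N2")]
    print(exact_match_ratio(predicted, reference))
```

Under this assumed definition, a report counts as a match only when both stages agree, so a correct T stage paired with a wrong N stage scores zero for that report.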
Limitations
The study used data from a single tertiary hospital, which may limit the generalizability of the results.