MGFusion: a multimodal large language model-guided information perception for infrared and visible image fusion
2024


Publication Evidence: high

Author Information

Author(s): Yang Zengyi, Li Yunping, Tang Xin, Xie MingHong

Primary Institution: Kunming University of Science and Technology

Hypothesis

Can a multimodal large language model improve the quality of infrared and visible image fusion?

Conclusion

The proposed method significantly enhances the quality of fused images by leveraging semantic information from a multimodal large language model.

Supporting Evidence

  • The proposed method outperforms existing methods in both visual quality and objective evaluation metrics.
  • Experimental results validate the effectiveness and superiority of the proposed method on multiple public datasets.

Takeaway

This study shows that guidance from a multimodal large language model can help combine infrared and visible images more effectively, producing fused results that are clearer and more informative.

Methodology

The study introduces a new fusion framework in which semantic information extracted by a multimodal large language model guides the enhancement of infrared and visible image features, improving the quality of the final fused image.
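The summary above does not detail the framework itself. As a minimal sketch of the general idea of semantically weighted fusion, the following combines an infrared and a visible image with a per-pixel weight map; the function name, the linear weighting scheme, and the weight map's origin are all assumptions for illustration, not the authors' actual method (in the paper, semantic guidance would come from the multimodal large language model):

```python
import numpy as np

def semantic_weighted_fusion(ir, vis, semantic_map):
    """Fuse two aligned images with a per-pixel weight map.

    ir, vis: float arrays in [0, 1] with the same shape.
    semantic_map: per-pixel weights in [0, 1]; here it is just an input
    array standing in for MLLM-derived semantic guidance.
    """
    w = np.clip(semantic_map, 0.0, 1.0)
    # Convex combination: weight 1 keeps the infrared pixel,
    # weight 0 keeps the visible pixel.
    return w * ir + (1.0 - w) * vis

# Toy example: a uniform weight of 0.5 reduces to a simple average.
ir = np.array([[0.8, 0.2], [0.6, 0.4]])
vis = np.array([[0.2, 0.8], [0.4, 0.6]])
fused = semantic_weighted_fusion(ir, vis, np.full_like(ir, 0.5))
```

With a uniform weight map this is just pixel averaging; the premise of semantically guided fusion is that a content-aware weight map preserves salient targets from the infrared image and texture detail from the visible image.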

Limitations

The method may not perform as well on other types of image fusion tasks without retraining.

Digital Object Identifier (DOI)

10.3389/fnbot.2024.1521603
