

San Diego Oral

ControlFusion: A Controllable Image Fusion Network with Language-Vision Degradation Prompts

Linfeng Tang · Yeda Wang · Zhanchuan Cai · Junjun Jiang · Jiayi Ma

Upper Level Ballroom 6CDEF
Thu 4 Dec 10 a.m. PST — 10:20 a.m. PST

Abstract:

Current image fusion methods struggle with real-world composite degradations and lack the flexibility to accommodate user-specific needs. To address this, we propose ControlFusion, a controllable fusion network guided by language-vision prompts that adaptively mitigates composite degradations. On the one hand, we construct a degraded imaging model based on physical mechanisms, such as Retinex theory and the atmospheric scattering principle, to simulate composite degradations and provide a data foundation for addressing realistic degradations. On the other hand, we devise a prompt-modulated restoration and fusion network that dynamically enhances features according to degradation prompts, enabling it to adapt to varying degradation levels. To support user-specific preferences in visual quality, a text encoder is incorporated to embed user-defined degradation types and levels as degradation prompts. Moreover, a spatial-frequency collaborative visual adapter is designed to autonomously perceive degradations from the source images, so that the network does not rely entirely on user instructions. Extensive experiments demonstrate that ControlFusion outperforms state-of-the-art (SOTA) fusion methods in fusion quality and degradation handling, particularly under real-world and compound degradations.
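As a rough illustration of what a physically motivated degradation model can look like, the sketch below composes a Retinex-style low-light step with the standard atmospheric scattering equation, I = J·t + A·(1 − t) with t = exp(−β·d). It is not the authors' imaging model: the function names, the parameters (`illum`, `beta`, `airlight`), the order in which the two degradations are applied, and the simplification that the clean pixel value acts as reflectance under unit illumination are all assumptions made for this example.

```python
import math

def low_light(pixel, illum=0.4):
    # Retinex view: observed = reflectance * illumination.
    # Assumption: the clean pixel is the reflectance under unit illumination,
    # so darkening reduces to scaling by a hypothetical illumination factor.
    return pixel * illum

def add_haze(pixel, depth, beta=1.0, airlight=0.9):
    # Atmospheric scattering: I = J * t + A * (1 - t), with t = exp(-beta * d).
    t = math.exp(-beta * depth)
    return pixel * t + airlight * (1.0 - t)

def degrade(image, depth_map, illum=0.4, beta=1.0, airlight=0.9):
    # Composite degradation on a grayscale image given as nested lists in [0, 1].
    # Assumption: low light is applied first, then haze; results are clipped to [0, 1].
    out = []
    for row, depth_row in zip(image, depth_map):
        out.append([
            min(1.0, max(0.0, add_haze(low_light(p, illum), d, beta, airlight)))
            for p, d in zip(row, depth_row)
        ])
    return out

# Usage: a 1x2 image whose second pixel is one depth unit away from the camera.
degraded = degrade([[0.5, 1.0]], [[0.0, 1.0]], illum=0.5, beta=1.0, airlight=1.0)
```

Varying `illum` and `beta` in a sketch like this gives a simple way to generate training pairs at different degradation levels, which is the kind of control that a degradation prompt would then need to describe.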
