DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging

April 16, 2025
Authors: Tianhui Song, Weixin Feng, Shuai Wang, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang
cs.AI

Abstract

The success of text-to-image (T2I) generation models has spurred a proliferation of model checkpoints fine-tuned from the same base model on various specialized datasets. This flood of specialized models creates new challenges of high parameter redundancy and huge storage cost, which calls for effective methods to consolidate the capabilities of diverse powerful models into a single one. A common practice in model merging is static linear interpolation in parameter space to achieve style mixing. However, this neglects a characteristic of the T2I generation task: the many distinct models cover a wide variety of styles, which may lead to incompatibility and confusion in the merged model. To address this issue, we introduce a style-promptable image generation pipeline that can accurately generate arbitrary-style images under the control of style vectors. Based on this design, we propose a score-distillation-based model merging paradigm (DMM) that compresses multiple models into a single versatile T2I model. Moreover, we rethink and reformulate the model merging task in the context of T2I generation by presenting new merging goals and evaluation protocols. Our experiments demonstrate that DMM can compactly reorganize the knowledge from multiple teacher models and achieve controllable arbitrary-style generation.
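
For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of static linear interpolation in parameter space. It is not the DMM method and is not taken from the paper. The `Checkpoint` type, the `merge_linear` function, the parameter name `unet.w`, and the toy values are all hypothetical, chosen only for illustration; real checkpoints would hold tensors rather than short lists of floats.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flattened weights.
Checkpoint = Dict[str, List[float]]


def merge_linear(checkpoints: Sequence[Checkpoint],
                 weights: Sequence[float]) -> Checkpoint:
    """Merge checkpoints with a fixed weighted average of each parameter."""
    if not checkpoints or len(checkpoints) != len(weights):
        raise ValueError("need one weight per checkpoint")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Normalize so the interpolation coefficients sum to 1.
    coeffs = [w / total for w in weights]

    merged: Checkpoint = {}
    for name, reference in checkpoints[0].items():
        acc = [0.0] * len(reference)
        for ckpt, c in zip(checkpoints, coeffs):
            values = ckpt[name]
            if len(values) != len(reference):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            for i, v in enumerate(values):
                acc[i] += c * v
        merged[name] = acc
    return merged


# Toy example: two fine-tuned "models" sharing one parameter.
realistic = {"unet.w": [1.0, 0.0]}
anime = {"unet.w": [0.0, 1.0]}
merged = merge_linear([realistic, anime], [0.5, 0.5])
```

The coefficients are fixed at merge time, so the result is one blended set of weights. This is the static behavior the abstract describes; it offers no way to select an individual source style at inference, which is the gap the style-vector control is meant to address.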
