DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging

April 16, 2025
Authors: Tianhui Song, Weixin Feng, Shuai Wang, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang
cs.AI

Abstract

The success of text-to-image (T2I) generation models has led to a proliferation of model checkpoints fine-tuned from the same base model on various specialized datasets. This large-scale production of specialized models creates new challenges of high parameter redundancy and large storage costs, which calls for effective methods to consolidate the capabilities of diverse, powerful models into a single one. A common practice in model merging is static linear interpolation in the parameter space to achieve style mixing. However, this practice neglects a feature of the T2I generation task: the many distinct models cover a wide variety of styles, which may lead to incompatibility and confusion in the merged model. To address this issue, we introduce a style-promptable image generation pipeline that can accurately generate arbitrary-style images under the control of style vectors. Based on this design, we propose a score-distillation-based model merging paradigm (DMM) that compresses multiple models into a single versatile T2I model. Moreover, we rethink and reformulate the model merging task in the context of T2I generation by presenting new merging goals and evaluation protocols. Our experiments demonstrate that DMM can compactly reorganize the knowledge from multiple teacher models and achieve controllable arbitrary-style generation.
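
For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of static linear interpolation in parameter space. It is not DMM itself. The checkpoint layout (a dictionary of parameter names mapped to flat lists of floats), the function name `merge_linear`, and the example parameter names are hypothetical and chosen only for illustration. The sketch assumes every checkpoint shares the same parameter names and shapes, as fine-tunes of one base model would.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def merge_linear(checkpoints: List[StateDict], weights: List[float]) -> StateDict:
    """Return the weighted average of same-shaped checkpoints, parameter by parameter."""
    if len(checkpoints) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights should sum to 1")

    merged: StateDict = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        merged[name] = [sum(w * v for w, v in zip(weights, col)) for col in columns]
    return merged


# Two toy "fine-tuned" checkpoints (values are made up for illustration).
realistic = {"unet.w": [1.0, 0.0], "unet.b": [0.5]}
anime = {"unet.w": [0.0, 1.0], "unet.b": [-0.5]}

merged = merge_linear([realistic, anime], [0.5, 0.5])
print(merged)
```

The weights are fixed once at merge time, so the resulting model holds a single blend of its sources. This is the static behavior the abstract describes, in contrast to selecting a style at inference time through a style vector.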
