LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

December 12, 2024
Authors: Enis Simsar, Thomas Hofmann, Federico Tombari, Pinar Yanardag
cs.AI

Abstract

Recent advances in text-to-image customization have enabled high-fidelity, context-rich generation of personalized images, allowing specific concepts to appear in a variety of scenarios. However, current methods struggle with combining multiple personalized models, often leading to attribute entanglement or requiring separate training to preserve concept distinctiveness. We present LoRACLR, a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model without additional individual fine-tuning. LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference. By enforcing distinct yet cohesive representations for each concept, LoRACLR enables efficient, scalable model composition for high-quality, multi-concept image synthesis. Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
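The abstract does not spell out LoRACLR's objective, but the core idea it describes — merging several per-concept LoRA weight deltas into one model whose outputs stay close to each original LoRA on that concept's own inputs — can be illustrated with the "attraction" half of such a merge. The sketch below is a hypothetical least-squares merge in NumPy, not the paper's actual method: the full approach additionally uses a contrastive (repulsion) term to keep different concepts' representations apart, and all names here (`merge_loras`, `reg`) are invented for illustration.

```python
import numpy as np

def merge_loras(deltas, inputs, reg=1e-6):
    """Merge per-concept LoRA weight deltas into one matrix W that
    minimizes sum_i ||W X_i - W_i X_i||^2 + reg * ||W||^2.

    deltas: list of (d_out, d_in) LoRA weight-delta matrices, one per concept
    inputs: list of (d_in, n_i) matrices of concept-specific input features
    reg:    small ridge term so the Gram matrix is always invertible

    This is only the alignment ("attraction") part; LoRACLR's contrastive
    objective would also push apart outputs of different concepts.
    """
    d_in = deltas[0].shape[1]
    gram = reg * np.eye(d_in)          # accumulated (regularized) Gram matrix
    cross = np.zeros_like(deltas[0])   # accumulated target cross-covariance
    for w_i, x_i in zip(deltas, inputs):
        gram += x_i @ x_i.T
        cross += (w_i @ x_i) @ x_i.T
    # Closed-form ridge least-squares solution: W = C G^{-1}
    return cross @ np.linalg.inv(gram)
```

Because the objective is a convex quadratic, the returned matrix attains a lower alignment loss than any other single delta, including a naive average of the per-concept LoRAs — which is the sense in which a principled merge can beat simple weight averaging.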
