
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

December 12, 2024
Authors: Enis Simsar, Thomas Hofmann, Federico Tombari, Pinar Yanardag
cs.AI

Abstract

Recent advances in text-to-image customization have enabled high-fidelity, context-rich generation of personalized images, allowing specific concepts to appear in a variety of scenarios. However, current methods struggle with combining multiple personalized models, often leading to attribute entanglement or requiring separate training to preserve concept distinctiveness. We present LoRACLR, a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model without additional individual fine-tuning. LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference. By enforcing distinct yet cohesive representations for each concept, LoRACLR enables efficient, scalable model composition for high-quality, multi-concept image synthesis. Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
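The abstract describes merging several per-concept LoRA weight deltas into one model with a contrastive objective: the merged weights should reproduce each original model's outputs on that concept's inputs (attraction) while staying distinct from the outputs of the other concepts (repulsion). The toy sketch below illustrates that idea on plain linear layers with NumPy gradient descent; all names, dimensions, and the loss weighting `lam` are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n = 8, 6, 32

# Shared base weight and two hypothetical per-concept LoRA deltas (toy stand-ins).
W0 = rng.normal(size=(d_in, d_out))
deltas = [rng.normal(scale=0.1, size=(d_in, d_out)) for _ in range(2)]

# Concept-specific inputs and the outputs each original fine-tuned model produces.
Xs = [rng.normal(size=(n, d_in)) for _ in range(2)]
Ys = [X @ (W0 + dW) for X, dW in zip(Xs, deltas)]

def contrastive_grad(dW, lam=0.05):
    """Gradient of: sum_i ||X_i(W0+dW) - Y_i||^2        (pull toward own concept)
                  - lam * sum_{i!=j} ||X_i(W0+dW) - Y_j||^2  (push from others)."""
    g = np.zeros_like(dW)
    for i, (X, Y) in enumerate(zip(Xs, Ys)):
        pred = X @ (W0 + dW)
        g += 2 * X.T @ (pred - Y)                     # attraction term
        for j, Yj in enumerate(Ys):
            if j != i:
                g -= lam * 2 * X.T @ (pred - Yj)      # repulsion term
    return g

def positive_loss(dW):
    """How well the single merged delta reproduces every concept's outputs."""
    return sum(np.sum((X @ (W0 + dW) - Y) ** 2) for X, Y in zip(Xs, Ys))

# Merge: optimize one shared delta instead of fine-tuning per concept.
dW = np.zeros_like(W0)
before = positive_loss(dW)
for _ in range(200):
    dW -= 1e-3 * contrastive_grad(dW)
after = positive_loss(dW)  # lower than `before`: merged weights fit both concepts
```

The key design point the abstract emphasizes is that this optimization acts only on the weight space of already-trained LoRA models, so no per-concept image data or additional individual fine-tuning is needed at merge time.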
