

AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models

December 5, 2024
Authors: Xinghui Li, Qichao Sun, Pengze Zhang, Fulong Ye, Zhichao Liao, Wanquan Feng, Songtao Zhao, Qian He
cs.AI

Abstract

Recent advances in garment-centric image generation from text and image prompts based on diffusion models are impressive. However, existing methods lack support for various combinations of attire, and struggle to preserve the garment details while maintaining faithfulness to the text prompts, limiting their performance across diverse scenarios. In this paper, we focus on a new task, i.e., Multi-Garment Virtual Dressing, and we propose a novel AnyDressing method for customizing characters conditioned on any combination of garments and any personalized text prompts. AnyDressing comprises two primary networks named GarmentsNet and DressingNet, which are respectively dedicated to extracting detailed clothing features and generating customized images. Specifically, we propose an efficient and scalable module called Garment-Specific Feature Extractor in GarmentsNet to individually encode garment textures in parallel. This design prevents garment confusion while ensuring network efficiency. Meanwhile, we design an adaptive Dressing-Attention mechanism and a novel Instance-Level Garment Localization Learning strategy in DressingNet to accurately inject multi-garment features into their corresponding regions. This approach efficiently integrates multi-garment texture cues into generated images and further enhances text-image consistency. Additionally, we introduce a Garment-Enhanced Texture Learning strategy to improve the fine-grained texture details of garments. Thanks to our well-crafted design, AnyDressing can serve as a plug-in module to easily integrate with any community control extensions for diffusion models, improving the diversity and controllability of synthesized images. Extensive experiments show that AnyDressing achieves state-of-the-art results.
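
As a rough illustration of the architecture the abstract describes, the sketch below outlines in PyTorch how garment images could be encoded independently (mirroring the parallel, per-garment Garment-Specific Feature Extractor) and how their concatenated tokens could be injected into the denoising features via cross-attention (mirroring Dressing-Attention). All module names, shapes, and the simple patch-embedding backbone are assumptions made for illustration only; this is a minimal sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn


class GarmentFeatureExtractor(nn.Module):
    """Encodes a single garment image into feature tokens (hypothetical patch-embedding backbone)."""

    def __init__(self, in_channels=3, dim=256):
        super().__init__()
        # Stand-in encoder: an 8x8 patch embedding followed by a nonlinearity.
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, dim, kernel_size=8, stride=8),
            nn.GELU(),
        )

    def forward(self, garment):                      # garment: (B, 3, H, W)
        feats = self.backbone(garment)               # (B, dim, H/8, W/8)
        return feats.flatten(2).transpose(1, 2)      # (B, N, dim) garment tokens


class DressingAttention(nn.Module):
    """Cross-attention that injects multi-garment tokens into the denoising image tokens."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, image_tokens, garment_tokens):
        # image_tokens: (B, M, dim); garment_tokens: (B, K*N, dim) for K garments.
        out, _ = self.attn(query=image_tokens, key=garment_tokens, value=garment_tokens)
        return image_tokens + out                    # residual injection


if __name__ == "__main__":
    extractor = GarmentFeatureExtractor()
    dressing_attn = DressingAttention()

    # Three garments (e.g. top, pants, shoes), each encoded independently so their features stay separate.
    garments = [torch.randn(2, 3, 256, 256) for _ in range(3)]
    garment_tokens = torch.cat([extractor(g) for g in garments], dim=1)  # (2, 3*1024, 256)

    image_tokens = torch.randn(2, 1024, 256)         # stand-in for U-Net latent tokens
    fused = dressing_attn(image_tokens, garment_tokens)
    print(fused.shape)                               # torch.Size([2, 1024, 256])
```

In this toy setup, encoding each garment with its own forward pass before concatenation is one way to keep per-garment features from mixing, which is the property the abstract attributes to the parallel extractor design.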

