TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models
November 27, 2024
Authors: Riza Velioglu, Petra Bevandic, Robin Chan, Barbara Hammer
cs.AI
Abstract
This paper introduces Virtual Try-Off (VTOFF), a novel task focused on
generating standardized garment images from single photos of clothed
individuals. Unlike traditional Virtual Try-On (VTON), which digitally dresses
models, VTOFF aims to extract a canonical garment image, posing unique
challenges in capturing garment shape, texture, and intricate patterns. This
well-defined target makes VTOFF particularly effective for evaluating
reconstruction fidelity in generative models. We present TryOffDiff, a model
that adapts Stable Diffusion with SigLIP-based visual conditioning to ensure
high fidelity and detail retention. Experiments on a modified VITON-HD dataset
show that our approach outperforms baseline methods based on pose transfer and
virtual try-on with fewer pre- and post-processing steps. Our analysis reveals
that traditional image generation metrics inadequately assess reconstruction
quality, prompting us to rely on DISTS for more accurate evaluation. Our
results highlight the potential of VTOFF to enhance product imagery in
e-commerce applications, advance generative model evaluation, and inspire
future work on high-fidelity reconstruction. Demo, code, and models are
available at: https://rizavelioglu.github.io/tryoffdiff/
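The core architectural idea described above — swapping Stable Diffusion's text conditioning for SigLIP image features injected via cross-attention — can be sketched in plain PyTorch. This is a minimal illustration, not the authors' implementation: the `VisualConditioning` adapter, the dimensions, and the stand-alone attention layer are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Assumed dimensions: SigLIP ViT-B/16 emits 768-d patch tokens, and SD 1.x
# UNet cross-attention expects 768-d context (SD 2.x would use 1024-d).
SIGLIP_DIM, CONTEXT_DIM = 768, 768

class VisualConditioning(nn.Module):
    """Illustrative adapter: project SigLIP image tokens into the
    cross-attention context space of the diffusion UNet."""
    def __init__(self):
        super().__init__()
        self.adapter = nn.Linear(SIGLIP_DIM, CONTEXT_DIM)

    def forward(self, siglip_tokens):  # (B, N_patches, SIGLIP_DIM)
        return self.adapter(siglip_tokens)

# Toy cross-attention step as it would occur inside a UNet block:
# latent tokens attend over the garment-image context instead of a text prompt.
B, N_latent, N_patches = 2, 64, 196
latents = torch.randn(B, N_latent, CONTEXT_DIM)          # noisy image latents
context = VisualConditioning()(torch.randn(B, N_patches, SIGLIP_DIM))

attn = nn.MultiheadAttention(CONTEXT_DIM, num_heads=8, batch_first=True)
out, _ = attn(query=latents, key=context, value=context)
print(tuple(out.shape))  # (2, 64, 768): latents updated with garment features
```

In the actual model this conditioning is applied at every cross-attention layer of the pretrained UNet during denoising; the sketch isolates a single layer to show how image tokens take the place of text embeddings.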