

Fashion-VDM: Video Diffusion Model for Virtual Try-On

October 31, 2024
作者: Johanna Karras, Yingwei Li, Nan Liu, Luyang Zhu, Innfarn Yoo, Andreas Lugmayr, Chris Lee, Ira Kemelmacher-Shlizerman
cs.AI

Abstract

We present Fashion-VDM, a video diffusion model (VDM) for generating virtual try-on videos. Given an input garment image and person video, our method aims to generate a high-quality try-on video of the person wearing the given garment, while preserving the person's identity and motion. Image-based virtual try-on has shown impressive results; however, existing video virtual try-on (VVT) methods still lack garment detail and temporal consistency. To address these issues, we propose a diffusion-based architecture for video virtual try-on, split classifier-free guidance for increased control over the conditioning inputs, and a progressive temporal training strategy for single-pass 64-frame, 512px video generation. We also demonstrate the effectiveness of joint image-video training for video try-on, especially when video data is limited. Our qualitative and quantitative experiments show that our approach sets a new state of the art for video virtual try-on. For additional results, visit our project page: https://johannakarras.github.io/Fashion-VDM.
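The abstract mentions "split classifier-free guidance" for controlling the two conditioning inputs (person video and garment image) independently. A common way to realize this, used by multi-condition diffusion models, is to give each conditioning signal its own guidance weight by chaining differences of noise predictions. The sketch below is a minimal illustration of that decomposition, not the paper's exact formulation; the `denoiser` stand-in, the weight values, and the conditioning order are all assumptions.

```python
import numpy as np


def denoiser(x, person=None, garment=None):
    # Stand-in for the video diffusion backbone: returns a noise
    # estimate for latent x. A real model would condition on the
    # person video and the garment image; here conditioning just
    # shifts the output so the guidance arithmetic is visible.
    bias = 0.0
    if person is not None:
        bias += 0.1
    if garment is not None:
        bias += 0.2
    return 0.5 * x + bias


def split_cfg(x, person, garment, w_person=2.0, w_garment=3.0):
    """Split classifier-free guidance (sketch): separate weights for
    the person and garment conditions. With w_person = w_garment = 1
    this reduces to the fully conditioned prediction."""
    eps_uncond = denoiser(x)                                # no conditioning
    eps_person = denoiser(x, person=person)                 # person only
    eps_full = denoiser(x, person=person, garment=garment)  # both
    return (eps_uncond
            + w_person * (eps_person - eps_uncond)
            + w_garment * (eps_full - eps_person))


# Toy usage on a small latent; weights > 1 amplify each condition.
x = np.random.default_rng(0).standard_normal((4, 8))
guided = split_cfg(x, person="person_video", garment="garment_image")
```

Splitting the guidance this way lets one strengthen garment fidelity without simultaneously over-amplifying the person conditioning, which a single shared scale cannot do.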

