Pathways on the Image Manifold: Image Editing via Video Generation
November 25, 2024
Authors: Noam Rotstein, Gal Yona, Daniel Silver, Roy Velich, David Bensaïd, Ron Kimmel
cs.AI
Abstract
Recent advances in image editing, driven by image diffusion models, have
shown remarkable progress. However, significant challenges remain, as these
models often struggle to follow complex edit instructions accurately and
frequently compromise fidelity by altering key elements of the original image.
Simultaneously, video generation has made remarkable strides, with models that
effectively function as consistent and continuous world simulators. In this
paper, we propose merging these two fields by utilizing image-to-video models
for image editing. We reformulate image editing as a temporal process, using
pretrained video models to create smooth transitions from the original image to
the desired edit. This approach traverses the image manifold continuously,
ensuring consistent edits while preserving the original image's key aspects.
Our approach achieves state-of-the-art results on text-based image editing,
demonstrating significant improvements in both edit accuracy and image
preservation.