ObjectMate: A Recurrence Prior for Object Insertion and Subject-Driven Generation
December 11, 2024
Authors: Daniel Winter, Asaf Shul, Matan Cohen, Dana Berman, Yael Pritch, Alex Rav-Acha, Yedid Hoshen
cs.AI
Abstract
This paper introduces a tuning-free method for both object insertion and
subject-driven generation. The task involves composing an object, given
multiple views, into a scene specified by either an image or text. Existing
methods struggle to fully meet the task's challenging objectives: (i)
seamlessly composing the object into the scene with photorealistic pose and
lighting, and (ii) preserving the object's identity. We hypothesize that
achieving these goals requires large-scale supervision, but manually collecting
sufficient data is simply too expensive. The key observation in this paper is
that many mass-produced objects recur across multiple images of large unlabeled
datasets, in different scenes, poses, and lighting conditions. We use this
observation to create massive supervision by retrieving sets of diverse views
of the same object. This powerful paired dataset enables us to train a
straightforward text-to-image diffusion architecture to map the object and
scene descriptions to the composited image. We compare our method, ObjectMate,
with state-of-the-art methods for object insertion and subject-driven
generation, using a single or multiple references. Empirically, ObjectMate
achieves superior identity preservation and more photorealistic composition.
Unlike many other multi-reference methods, ObjectMate does not
require slow test-time tuning.
AI-Generated Summary