StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
September 19, 2024
Authors: Zhengguang Zhou, Jing Li, Huaxia Li, Nemo Chen, Xu Tang
cs.AI
Abstract
Tuning-free personalized image generation methods have achieved significant
success in maintaining facial consistency, i.e., identities, even with multiple
characters. However, the lack of holistic consistency in scenes with multiple
characters hampers these methods' ability to create a cohesive narrative. In
this paper, we introduce StoryMaker, a personalization solution that preserves
not only facial consistency but also clothing, hairstyles, and body
consistency, thus facilitating the creation of a story through a series of
images. StoryMaker incorporates conditions based on face identities and cropped
character images, which include clothing, hairstyles, and bodies. Specifically,
we integrate the facial identity information with the cropped character images
using the Positional-aware Perceiver Resampler (PPR) to obtain distinct
character features. To prevent intermingling of multiple characters and the
background, we separately constrain the cross-attention impact regions of
different characters and the background using MSE loss with segmentation masks.
Additionally, we train the generation network conditioned on poses to promote
decoupling from poses. A LoRA is also employed to enhance fidelity and quality.
Experiments underscore the effectiveness of our approach. StoryMaker supports
numerous applications and is compatible with other societal plug-ins. Our
source code and model weights are available at
https://github.com/RedAIGC/StoryMaker.
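
The abstract's idea of constraining cross-attention regions with an MSE loss and segmentation masks can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which attention layers are used, how maps are normalized, or how the loss is weighted, so the function name `masked_attention_mse`, the nested-list data layout, and the uniform averaging are all assumptions made for illustration.

```python
from typing import Sequence

Grid = Sequence[Sequence[float]]


def masked_attention_mse(attn_maps: Sequence[Grid], masks: Sequence[Grid]) -> float:
    """Mean squared error between attention maps and segmentation masks.

    Hypothetical layout (an assumption, not taken from the paper):
    attn_maps[k][i][j] is the cross-attention weight that spatial position
    (i, j) gives to the tokens of region k (a character or the background);
    masks[k][i][j] is 1.0 where (i, j) belongs to region k, else 0.0.
    """
    if len(attn_maps) != len(masks):
        raise ValueError("need exactly one mask per attention map")
    total, count = 0.0, 0
    for attn, mask in zip(attn_maps, masks):
        for attn_row, mask_row in zip(attn, mask):
            for a, m in zip(attn_row, mask_row):
                total += (a - m) ** 2
                count += 1
    if count == 0:
        raise ValueError("attention maps must not be empty")
    return total / count


# Toy 2x2 layout: character A occupies the left column, character B the right.
mask_a = [[1.0, 0.0], [1.0, 0.0]]
mask_b = [[0.0, 1.0], [0.0, 1.0]]

# Attention that stays inside each character's own region is not penalized.
separated = masked_attention_mse([mask_a, mask_b], [mask_a, mask_b])

# Attention spread evenly over both characters is penalized.
uniform = [[0.5, 0.5], [0.5, 0.5]]
mixed = masked_attention_mse([uniform, uniform], [mask_a, mask_b])

print(separated, mixed)
```

In this toy setup the loss is zero when each character's attention matches its own mask and grows when attention leaks across characters, which is the kind of intermingling the abstract describes the loss as preventing.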