DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
December 12, 2024
Authors: Hongxiang Li, Yaowei Li, Yuhang Yang, Junjie Cao, Zhihong Zhu, Xuxin Cheng, Long Chen
cs.AI
Abstract
Controllable human image animation aims to generate videos from reference
images using driving videos. Due to the limited control signals provided by
sparse guidance (e.g., skeleton pose), recent works have attempted to introduce
additional dense conditions (e.g., depth map) to ensure motion alignment.
However, such strict dense guidance impairs the quality of the generated video
when the body shape of the reference character differs significantly from that
of the driving video. In this paper, we present DisPose to mine more
generalizable and effective control signals without additional dense input,
which disentangles the sparse skeleton pose in human image animation into
motion field guidance and keypoint correspondence. Specifically, we generate a
dense motion field from a sparse motion field and the reference image, which
provides region-level dense guidance while maintaining the generalization of
the sparse pose control. We also extract diffusion features corresponding to
pose keypoints from the reference image, and then these point features are
transferred to the target pose to provide distinct identity information. To
seamlessly integrate into existing models, we propose a plug-and-play hybrid
ControlNet that improves the quality and consistency of generated videos while
freezing the existing model parameters. Extensive qualitative and quantitative
experiments demonstrate the superiority of DisPose compared to current methods.
Code: https://github.com/lihxxx/DisPose
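To make the two disentangled signals concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the dictionary-based representation, and the `feature_map[y][x]` layout are all assumptions made for illustration. It shows only the sparse starting point, namely per-keypoint displacements between the reference and target poses, and the copying of reference features to target keypoint locations. The step that turns the sparse field into a dense one, the diffusion feature extraction, and the hybrid ControlNet are not modeled here.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, int]        # (x, y) pixel coordinates of a pose keypoint
Vector = Tuple[float, float]   # (dx, dy) displacement in pixels


def sparse_motion_field(ref_kpts: List[Point],
                        tgt_kpts: List[Point]) -> Dict[Point, Vector]:
    """Hypothetical helper: displacement of each keypoint from the reference
    pose to the target pose, stored at the keypoint's reference location.
    Keypoints that share a reference location overwrite each other."""
    if len(ref_kpts) != len(tgt_kpts):
        raise ValueError("keypoint lists must have the same length")
    return {
        (rx, ry): (float(tx - rx), float(ty - ry))
        for (rx, ry), (tx, ty) in zip(ref_kpts, tgt_kpts)
    }


def transfer_point_features(feature_map: Sequence[Sequence[Sequence[float]]],
                            ref_kpts: List[Point],
                            tgt_kpts: List[Point]) -> Dict[Point, List[float]]:
    """Hypothetical helper: read the feature vector at each reference keypoint
    (feature_map is indexed as [y][x]) and place a copy of it at the
    corresponding target keypoint."""
    if len(ref_kpts) != len(tgt_kpts):
        raise ValueError("keypoint lists must have the same length")
    return {
        (tx, ty): list(feature_map[ry][rx])
        for (rx, ry), (tx, ty) in zip(ref_kpts, tgt_kpts)
    }
```

Both helpers return sparse maps keyed by pixel location. In this sketch a dense, region-level field would have to be produced by a separate step that also takes the reference image into account, as the abstract describes.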