Move-in-2D: 2D-Conditioned Human Motion Generation

December 17, 2024
Authors: Hsin-Ping Huang, Yang Zhou, Jui-Hsien Wang, Difan Liu, Feng Liu, Ming-Hsuan Yang, Zhan Xu
cs.AI

Abstract

Generating realistic human videos remains a challenging task, with the most effective methods currently relying on a human motion sequence as a control signal. Existing approaches often use existing motion extracted from other videos, which restricts applications to specific motion types and global scene matching. We propose Move-in-2D, a novel approach to generate human motion sequences conditioned on a scene image, allowing for diverse motion that adapts to different scenes. Our approach utilizes a diffusion model that accepts both a scene image and text prompt as inputs, producing a motion sequence tailored to the scene. To train this model, we collect a large-scale video dataset featuring single-human activities, annotating each video with the corresponding human motion as the target output. Experiments demonstrate that our method effectively predicts human motion that aligns with the scene image after projection. Furthermore, we show that the generated motion sequence improves human motion quality in video synthesis tasks.
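The abstract describes a diffusion model that takes a scene image and a text prompt as conditioning inputs and outputs a human motion sequence. The following is a minimal sketch of what such conditional diffusion sampling could look like, assuming a standard DDPM formulation; the `denoiser` callable, the `scene_emb`/`text_emb` embeddings, the pose dimensionality, and the linear noise schedule are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (not the authors' released code): DDPM-style sampling of a
# human motion sequence conditioned on a scene image and a text prompt.
# `denoiser`, `scene_emb`, and `text_emb` are hypothetical placeholders.
import torch

T_FRAMES, D_POSE = 120, 263      # assumed sequence length / per-frame pose dim
N_STEPS = 1000                   # assumed number of diffusion steps

betas = torch.linspace(1e-4, 0.02, N_STEPS)   # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def sample_motion(denoiser, scene_emb, text_emb):
    """Reverse-diffuse a motion sequence given scene and text embeddings."""
    x = torch.randn(1, T_FRAMES, D_POSE)               # start from pure noise
    for t in reversed(range(N_STEPS)):
        t_batch = torch.full((1,), t, dtype=torch.long)
        # The denoiser predicts the noise in x_t, conditioned on the scene
        # image embedding and the text-prompt embedding.
        eps = denoiser(x, t_batch, scene_emb, text_emb)
        alpha, alpha_bar = alphas[t], alpha_bars[t]
        mean = (x - (1.0 - alpha) / (1.0 - alpha_bar).sqrt() * eps) / alpha.sqrt()
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise
    return x  # (1, T_FRAMES, D_POSE) pose sequence, e.g. to drive video synthesis
```

In the setting the abstract describes, a sampled pose sequence like this would then serve as the control signal for a downstream human video synthesis model.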

