HumanMM: Global Human Motion Recovery from Multi-shot Videos

March 10, 2025
Authors: Yuhong Zhang, Guanlin Wu, Ling-Hao Chen, Zhuokai Zhao, Jing Lin, Xiaoke Jiang, Jiamin Wu, Zhuoheng Li, Hao Frank Yang, Haoqian Wang, Lei Zhang
cs.AI

Abstract

We present a framework for reconstructing long-sequence 3D human motion in world coordinates from in-the-wild videos that contain multiple shot transitions. Such long in-the-wild motion sequences are valuable for applications such as motion generation and motion understanding, but they are difficult to recover because of the abrupt shot transitions, partial occlusions, and dynamic backgrounds present in these videos. Existing methods either focus on single-shot videos, where continuity is maintained within a single camera view, or simplify multi-shot alignment by performing it in camera space only. We address these challenges by integrating enhanced camera pose estimation with Human Motion Recovery (HMR), adding a shot transition detector and a robust alignment module that keep pose and orientation continuous across shots. A custom motion integrator mitigates foot sliding and ensures temporal consistency of the human pose. Extensive evaluations on a multi-shot dataset that we built from public 3D human datasets demonstrate the robustness of our method in reconstructing realistic human motion in world coordinates.
