CoMotion: Concurrent Multi-person 3D Motion
April 16, 2025
Authors: Alejandro Newell, Peiyun Hu, Lahav Lipson, Stephan R. Richter, Vladlen Koltun
cs.AI
Abstract
We introduce an approach for detecting and tracking detailed 3D poses of
multiple people from a single monocular camera stream. Our system maintains
temporally coherent predictions in crowded scenes filled with difficult poses
and occlusions. Our model performs both strong per-frame detection and a
learned pose update to track people from frame to frame. Rather than match
detections across time, poses are updated directly from a new input image,
which enables online tracking through occlusion. We train on numerous image and
video datasets leveraging pseudo-labeled annotations to produce a model that
matches state-of-the-art systems in 3D pose estimation accuracy while being
faster and more accurate in tracking multiple people through time. Code and
weights are provided at https://github.com/apple/ml-comotion.