CoMotion: Concurrent Multi-person 3D Motion

April 16, 2025
Authors: Alejandro Newell, Peiyun Hu, Lahav Lipson, Stephan R. Richter, Vladlen Koltun
cs.AI

Abstract

We introduce an approach for detecting and tracking detailed 3D poses of multiple people from a single monocular camera stream. Our system maintains temporally coherent predictions in crowded scenes filled with difficult poses and occlusions. Our model performs both strong per-frame detection and a learned pose update to track people from frame to frame. Rather than matching detections across time, poses are updated directly from a new input image, which enables online tracking through occlusion. We train on numerous image and video datasets, leveraging pseudo-labeled annotations, to produce a model that matches state-of-the-art systems in 3D pose estimation accuracy while being faster and more accurate in tracking multiple people through time. Code and weights are provided at https://github.com/apple/ml-comotion
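
The control flow described in the abstract (update existing tracks directly from the new frame, and use per-frame detection only to start new tracks) can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: `Track`, `pose_distance`, `step`, the `detect` and `update` callables, and the `new_track_threshold` rule are all hypothetical names introduced here, and a pose is reduced to a plain list of floats. The released code at the repository above is the reference for the real model.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Pose = List[float]  # toy stand-in for a 3D pose parameter vector (assumption)


@dataclass
class Track:
    track_id: int
    pose: Pose


def pose_distance(a: Pose, b: Pose) -> float:
    """Largest per-coordinate difference between two toy poses."""
    return max(abs(x - y) for x, y in zip(a, b))


def step(
    tracks: List[Track],
    image: Any,
    detect: Callable[[Any], List[Pose]],
    update: Callable[[Pose, Any], Pose],
    next_id: int,
    new_track_threshold: float = 1.0,  # hypothetical duplicate-suppression rule
) -> Tuple[List[Track], int]:
    """Advance all tracks by one frame and return (tracks, next free id)."""
    # 1) Update every existing track directly from the new image.
    #    No detection-to-track matching happens here, so a track keeps
    #    its identity even when the detector misses the person.
    updated = [Track(t.track_id, update(t.pose, image)) for t in tracks]

    # 2) Per-frame detections only start new tracks, and only when no
    #    existing track already explains them.
    for det in detect(image):
        if all(pose_distance(det, t.pose) > new_track_threshold for t in updated):
            updated.append(Track(next_id, det))
            next_id += 1
    return updated, next_id
```

In this sketch `detect` and `update` stand in for learned modules. The point of the structure is that identity is carried by the update step, so a person who goes undetected for a few frames keeps the same `track_id` as long as the update function can still follow them.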
