
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm

February 4, 2025
Authors: Ziyan Guo, Zeyu Hu, Na Zhao, De Wen Soh
cs.AI

Abstract

Human motion generation and editing are key components of computer graphics and vision. However, current approaches in this field tend to offer isolated solutions tailored to specific tasks, which can be inefficient and impractical for real-world applications. While some efforts have aimed to unify motion-related tasks, these methods simply use different modalities as conditions to guide motion generation. Consequently, they lack editing capabilities and fine-grained control, and they fail to facilitate knowledge sharing across tasks. To address these limitations and provide a versatile, unified framework capable of handling both human motion generation and editing, we introduce a novel paradigm: Motion-Condition-Motion, which enables the unified formulation of diverse tasks with three concepts: source motion, condition, and target motion. Based on this paradigm, we propose a unified framework, MotionLab, which incorporates rectified flows to learn the mapping from source motion to target motion, guided by the specified conditions. In MotionLab, we introduce 1) the MotionFlow Transformer to enhance conditional generation and editing without task-specific modules; 2) Aligned Rotational Position Encoding to guarantee time synchronization between source motion and target motion; 3) Task Specified Instruction Modulation; and 4) Motion Curriculum Learning for effective multi-task learning and knowledge sharing across tasks. Notably, our MotionLab demonstrates promising generalization capabilities and inference efficiency across multiple benchmarks for human motion. Our code and additional video results are available at: https://diouo.github.io/motionlab.github.io/.
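
To make the rectified-flow idea at the core of this paradigm concrete, below is a minimal training-step sketch: a network regresses the straight-line velocity from a source motion to a target motion under a condition embedding. This is an illustrative assumption, not MotionLab's released implementation; the module name `VelocityNet`, the tensor shapes, and the simple concatenation-based conditioning are all hypothetical.

```python
# Minimal rectified-flow sketch for a Motion-Condition-Motion task.
# Hypothetical: names, shapes, and conditioning are illustrative only,
# not the authors' code (which uses a MotionFlow Transformer).
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Predicts the flow velocity v(x_t, t, c) for motion sequences."""
    def __init__(self, motion_dim: int, cond_dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(motion_dim + cond_dim + 1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, motion_dim),
        )

    def forward(self, x_t, t, cond):
        # Broadcast the scalar time t over the sequence, then concatenate
        # motion features, condition features, and time along the last dim.
        t_feat = t[:, None, None].expand(x_t.size(0), x_t.size(1), 1)
        return self.net(torch.cat([x_t, cond, t_feat], dim=-1))

def rectified_flow_step(model, optimizer, source, target, cond):
    """One step: regress the straight-line velocity (target - source)."""
    t = torch.rand(source.size(0), device=source.device)   # t ~ U[0, 1]
    x_t = (1 - t[:, None, None]) * source + t[:, None, None] * target
    v_pred = model(x_t, t, cond)
    loss = ((v_pred - (target - source)) ** 2).mean()      # velocity matching
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage: batch of 8 sequences, 60 frames, 66-D poses, 128-D condition.
model = VelocityNet(motion_dim=66, cond_dim=128)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
src, tgt = torch.randn(8, 60, 66), torch.randn(8, 60, 66)
cond = torch.randn(8, 60, 128)
print(rectified_flow_step(model, opt, src, tgt, cond))
```

At inference, the target motion would be recovered by integrating the learned velocity field from the source motion (or from noise, for pure generation) toward t = 1, for example with a few Euler steps; the abstract's reported inference efficiency reflects how few such steps rectified flows typically need.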
