GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography
April 9, 2025
Authors: Mengchen Zhang, Tong Wu, Jing Tan, Ziwei Liu, Gordon Wetzstein, Dahua Lin
cs.AI
Abstract
Camera trajectory design plays a crucial role in video production, serving as
a fundamental tool for conveying directorial intent and enhancing visual
storytelling. In cinematography, Directors of Photography meticulously craft
camera movements to achieve expressive and intentional framing. However,
existing methods for camera trajectory generation remain limited: traditional
approaches rely on geometric optimization or handcrafted procedural systems,
while recent learning-based methods often inherit structural biases or lack
textual alignment, constraining creative synthesis. In this work, we introduce
an auto-regressive model inspired by the expertise of Directors of Photography
to generate artistic and expressive camera trajectories. We first present
DataDoP, a large-scale multi-modal dataset containing 29K real-world shots with
free-moving camera trajectories, depth maps, and detailed captions describing
specific movements, interaction with the scene, and directorial intent. Building
on this comprehensive and diverse dataset, we then train GenDoP, an
auto-regressive, decoder-only Transformer for high-quality, context-aware camera
movement generation based on text guidance and RGBD inputs. Extensive
experiments demonstrate that compared to existing methods, GenDoP offers better
controllability, finer-grained trajectory adjustments, and higher motion
stability. We believe our approach establishes a new standard for
learning-based cinematography, paving the way for future advancements in camera
control and filmmaking. Our project website:
https://kszpxxzmc.github.io/GenDoP/.