GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography

April 9, 2025
Authors: Mengchen Zhang, Tong Wu, Jing Tan, Ziwei Liu, Gordon Wetzstein, Dahua Lin
cs.AI

Abstract

Camera trajectory design plays a crucial role in video production, serving as a fundamental tool for conveying directorial intent and enhancing visual storytelling. In cinematography, Directors of Photography meticulously craft camera movements to achieve expressive and intentional framing. However, existing methods for camera trajectory generation remain limited: traditional approaches rely on geometric optimization or handcrafted procedural systems, while recent learning-based methods often inherit structural biases or lack textual alignment, constraining creative synthesis. In this work, we introduce an auto-regressive model inspired by the expertise of Directors of Photography to generate artistic and expressive camera trajectories. We first introduce DataDoP, a large-scale multi-modal dataset containing 29K real-world shots with free-moving camera trajectories, depth maps, and detailed captions covering specific movements, interaction with the scene, and directorial intent. Building on this comprehensive and diverse dataset, we further train GenDoP, an auto-regressive, decoder-only Transformer that generates high-quality, context-aware camera movements from text guidance and RGBD inputs. Extensive experiments demonstrate that, compared to existing methods, GenDoP offers better controllability, finer-grained trajectory adjustments, and higher motion stability. We believe our approach establishes a new standard for learning-based cinematography, paving the way for future advancements in camera control and filmmaking. Project website: https://kszpxxzmc.github.io/GenDoP/