AutoVFX: Physically Realistic Video Editing from Natural Language Instructions

Abstract

Modern visual effects (VFX) software has made it possible for skilled artists to create imagery of virtually anything. However, the creation process remains laborious, complex, and largely inaccessible to everyday users. In this work, we present AutoVFX, a framework that automatically creates realistic and dynamic VFX videos from a single video and natural language instructions. By carefully integrating neural scene modeling, LLM-based code generation, and physical simulation, AutoVFX is able to provide physically grounded, photorealistic editing effects that can be controlled directly using natural language instructions. We conduct extensive experiments to validate AutoVFX's efficacy across a diverse spectrum of videos and instructions. Quantitative and qualitative results suggest that AutoVFX outperforms all competing methods by a large margin in generative quality, instruction alignment, editing versatility, and physical plausibility.
