In-2-4D: Inbetweening from Two Single-View Images to 4D Generation
April 11, 2025
Authors: Sauradip Nag, Daniel Cohen-Or, Hao Zhang, Ali Mahdavi-Amiri
cs.AI
Abstract
We propose a new problem, In-2-4D, for generative 4D (i.e., 3D + motion)
inbetweening from a minimalistic input setting: two single-view images
capturing an object in two distinct motion states. Given two images
representing the start and end states of an object in motion, our goal is to
generate and reconstruct the motion in 4D. We utilize a video interpolation
model to predict the motion, but large frame-to-frame motions can lead to
ambiguous interpretations. To overcome this, we employ a hierarchical approach
to identify keyframes that are visually close to the input states and show
significant motion, then generate smooth fragments between them. For each
fragment, we construct the 3D representation of the keyframe using Gaussian
Splatting. The temporal frames within the fragment guide the motion, enabling
their transformation into dynamic Gaussians through a deformation field. To
improve temporal consistency and refine 3D motion, we expand the self-attention
of multi-view diffusion across timesteps and apply rigid transformation
regularization. Finally, we merge the independently generated 3D motion
segments by interpolating boundary deformation fields and optimizing them to
align with the guiding video, ensuring smooth and flicker-free transitions.
Through extensive qualitative and quantitative experiments as well as a user
study, we show the effectiveness of our method and its components. The project
page is available at https://in-2-4d.github.io/.
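
The hierarchical step described above (split a large motion into fragments whose endpoints are keyframes) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the keyframe criterion, so the sketch substitutes plain recursive bisection, and the names `split_into_fragments`, `motion`, `midpoint`, and `max_motion` are hypothetical. Frames are modeled as floats purely for illustration; a real system would compare images and query a video interpolation model.

```python
from typing import Callable, List, Tuple

# Toy stand-in for an image (assumption): a real pipeline would use pixel arrays.
Frame = float


def split_into_fragments(
    start: Frame,
    end: Frame,
    motion: Callable[[Frame, Frame], float],
    midpoint: Callable[[Frame, Frame], Frame],
    max_motion: float,
    max_depth: int = 8,
) -> List[Tuple[Frame, Frame]]:
    """Recursively bisect (start, end) until each fragment's motion is small.

    `motion` and `midpoint` are hypothetical callbacks: the first scores how
    far apart two frames are, the second proposes an in-between keyframe
    (standing in for a video interpolation model).
    """
    # Stop when the motion is small enough or the recursion budget is spent.
    if max_depth == 0 or motion(start, end) <= max_motion:
        return [(start, end)]
    key = midpoint(start, end)  # hypothetical keyframe between the two states
    left = split_into_fragments(start, key, motion, midpoint, max_motion, max_depth - 1)
    right = split_into_fragments(key, end, motion, midpoint, max_motion, max_depth - 1)
    return left + right


# Toy usage: states are points on a line, motion is their distance.
fragments = split_into_fragments(
    0.0,
    1.0,
    motion=lambda a, b: abs(b - a),
    midpoint=lambda a, b: (a + b) / 2,
    max_motion=0.25,
)
print(fragments)
```

Consecutive fragments share an endpoint, which is the boundary where the abstract's final merging step would interpolate deformation fields between independently generated segments.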