In-2-4D: Inbetweening from Two Single-View Images to 4D Generation

Abstract

We propose a new problem, In-2-4D, for generative 4D (i.e., 3D + motion) inbetweening from a minimalistic input setting: two single-view images capturing an object in two distinct motion states. Given two images representing the start and end states of an object in motion, our goal is to generate and reconstruct the motion in 4D. We use a video interpolation model to predict the motion, but large frame-to-frame motions can lead to ambiguous interpretations. To overcome this, we employ a hierarchical approach that identifies keyframes that are visually close to the input states and show significant motion, then generates smooth fragments between them. For each fragment, we construct the 3D representation of the keyframe using Gaussian Splatting. The temporal frames within the fragment guide the motion, enabling their transformation into dynamic Gaussians through a deformation field. To improve temporal consistency and refine 3D motion, we expand the self-attention of multi-view diffusion across timesteps and apply rigid transformation regularization. Finally, we merge the independently generated 3D motion segments by interpolating boundary deformation fields and optimizing them to align with the guiding video, ensuring smooth and flicker-free transitions. Through extensive qualitative and quantitative experiments as well as a user study, we show the effectiveness of our method and its components. The project page is available at https://in-2-4d.github.io/.
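The final merging step, interpolating boundary deformation fields between independently generated segments, can be illustrated with a small sketch. This is not the authors' implementation. It assumes, purely for illustration, that each segment's deformation field is stored as per-frame displacements of the same N Gaussian centers, and it uses a simple linear ramp toward the average of the two boundary frames. The function name `merge_segments` and the `window` parameter are hypothetical, and the optimization against the guiding video described in the abstract is omitted.

```python
import numpy as np

def merge_segments(seg_a, seg_b, window=3):
    """Join two displacement sequences so they agree at the shared boundary.

    seg_a, seg_b: arrays of shape (T, N, 3) holding per-frame displacements
    of N Gaussian centers (an assumed representation, not the paper's).
    The last frame of seg_a and the first frame of seg_b describe the same
    instant but may disagree because the segments were generated separately.
    """
    a = np.array(seg_a, dtype=float)  # copies, so the inputs stay untouched
    b = np.array(seg_b, dtype=float)

    # Shared boundary state: the average of the two conflicting estimates.
    target = 0.5 * (a[-1] + b[0])

    # Clamp the blending window to the available number of frames.
    w = min(window, len(a), len(b))

    # Ramp weights that reach 1 exactly at the boundary frame.
    ramp = np.linspace(0.0, 1.0, w + 1)[1:]

    # Pull the tail of A and the head of B toward the shared boundary state.
    corr_a = target - a[-1]
    corr_b = target - b[0]
    a[-w:] += ramp[:, None, None] * corr_a
    b[:w] += ramp[::-1][:, None, None] * corr_b

    # Both segments now contain the same boundary frame; keep it once.
    return np.concatenate([a, b[1:]], axis=0)
```

With this kind of blend, the correction is spread over `window` frames on each side rather than applied as a single jump at the boundary, which is the intuition behind avoiding flicker at segment transitions.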
