V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians
September 20, 2024
Authors: Penghao Wang, Zhirui Zhang, Liao Wang, Kaixin Yao, Siyuan Xie, Jingyi Yu, Minye Wu, Lan Xu
cs.AI
Abstract
Experiencing high-fidelity volumetric video as seamlessly as 2D videos is a
long-held dream. However, current dynamic 3DGS methods, despite their high
rendering quality, face challenges in streaming on mobile devices due to
computational and bandwidth constraints. In this paper, we introduce
V^3 (Viewing Volumetric Videos), a novel approach that enables
high-quality mobile rendering through the streaming of dynamic Gaussians. Our
key innovation is to view dynamic 3DGS as 2D videos, facilitating the use of
hardware video codecs. Additionally, we propose a two-stage training strategy
that reduces storage requirements while keeping training fast. The first stage
employs hash encoding and a shallow MLP to learn motion, then reduces the number
of Gaussians through pruning to meet streaming requirements, while the
second stage fine-tunes the other Gaussian attributes using a residual entropy
loss and a temporal loss to improve temporal continuity. This strategy, which
disentangles motion and appearance, maintains high rendering quality with
compact storage requirements. We also design a multi-platform player to
decode and render 2D Gaussian videos. Extensive experiments demonstrate the
effectiveness of V^3, which outperforms other methods by enabling
high-quality rendering and streaming on common devices, something not achieved before.
As the first to stream dynamic Gaussians on mobile devices, our companion
player offers users an unprecedented volumetric video experience, including
smooth scrolling and instant sharing. Our project page with source code is
available at https://authoritywang.github.io/v3/.
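
The idea of treating dynamic Gaussians as 2D video can be illustrated with a minimal sketch. The abstract does not specify the image layout or quantization scheme, so everything below is an assumption made for illustration: one scalar attribute per Gaussian (for example opacity) is min-max quantized to 8 bits and laid out row by row in a square grid, so that each time step yields one image a standard video codec could compress. The function names `pack_attribute` and `unpack_attribute` are hypothetical and are not taken from the V^3 code.

```python
from math import isqrt


def pack_attribute(values, bits=8):
    """Quantize one scalar attribute of N Gaussians into a square 2D grid.

    Illustrative assumption, not the V^3 layout. `values` must be non-empty.
    Returns (grid, lo, hi, n); lo/hi/n are the metadata needed to dequantize.
    """
    n = len(values)
    side = isqrt(n - 1) + 1                # smallest square side holding n values
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1
    scale = levels / (hi - lo) if hi > lo else 0.0
    flat = [round((v - lo) * scale) for v in values]
    flat += [0] * (side * side - n)        # pad the unused pixels with zeros
    grid = [flat[r * side:(r + 1) * side] for r in range(side)]
    return grid, lo, hi, n


def unpack_attribute(grid, lo, hi, n, bits=8):
    """Invert pack_attribute up to quantization error."""
    step = (hi - lo) / (2 ** bits - 1)
    flat = [p for row in grid for p in row][:n]  # drop the padding pixels
    return [lo + p * step for p in flat]
```

Calling `pack_attribute` once per attribute and per frame would give a sequence of images per attribute. The round-trip error of each value is bounded by half a quantization step, `(hi - lo) / 255 / 2` at 8 bits, which is one reason losses that keep attributes temporally smooth and low in entropy, as the second training stage describes, would plausibly matter for codec efficiency.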