SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation

April 19, 2025
Authors: Minho Park, Taewoong Kang, Jooyeol Yun, Sungwon Hwang, Jaegul Choo
cs.AI

Abstract

The increasing demand for AR/VR applications has highlighted the need for high-quality 360-degree panoramic content. However, generating high-quality 360-degree panoramic images and videos remains challenging because of the severe distortions introduced by equirectangular projection (ERP). Existing approaches either fine-tune pretrained diffusion models on limited ERP datasets or use tuning-free methods that still rely on ERP latent representations, which leads to discontinuities near the poles. In this paper, we introduce SphereDiff, a novel approach for seamless 360-degree panoramic image and video generation using state-of-the-art diffusion models without additional tuning. We define a spherical latent representation that ensures uniform distribution across all perspectives, mitigating the distortions inherent in ERP. We extend MultiDiffusion to spherical latent space and propose a spherical latent sampling method that enables direct use of pretrained diffusion models. Moreover, we introduce distortion-aware weighted averaging to further improve generation quality in the projection process. Our method outperforms existing approaches in generating 360-degree panoramic content while maintaining high fidelity, making it a robust solution for immersive AR/VR applications. The code is available at https://github.com/pmh9960/SphereDiff.
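
To make the pole distortion mentioned in the abstract concrete, the following is a minimal sketch, not SphereDiff's implementation. It computes how much of the sphere a single ERP pixel covers at different rows, and contrasts that with a near-uniform point set on the sphere. The Fibonacci lattice used here is an assumption chosen for illustration (the abstract does not say how the spherical latents are placed), and the function names `erp_pixel_solid_angle` and `fibonacci_sphere` are hypothetical.

```python
import math


def erp_pixel_solid_angle(row, height, width):
    """Solid angle (steradians) covered by one pixel in `row` of an
    height x width equirectangular (ERP) image."""
    # Latitude edges of this row, running from +pi/2 (top) to -pi/2 (bottom).
    lat_top = math.pi / 2 - math.pi * row / height
    lat_bottom = math.pi / 2 - math.pi * (row + 1) / height
    # Area of the latitude band on the unit sphere, shared by `width` pixels.
    band = 2 * math.pi * (math.sin(lat_top) - math.sin(lat_bottom))
    return band / width


def fibonacci_sphere(n):
    """Return n near-uniformly spaced unit vectors (x, y, z).

    Illustrative assumption only: one common way to spread points evenly
    over a sphere, not necessarily the scheme used by SphereDiff.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # Evenly spaced z values give bands of equal area on the sphere.
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


if __name__ == "__main__":
    H, W = 512, 1024
    equator = erp_pixel_solid_angle(H // 2, H, W)
    pole = erp_pixel_solid_angle(0, H, W)
    print(f"equator/pole solid-angle ratio: {equator / pole:.1f}")

    pts = fibonacci_sphere(1000)
    cap = sum(1 for p in pts if p[2] > 0.9)
    print(f"points in polar cap z > 0.9: {cap} of {len(pts)}")
```

In an ERP grid every row has the same number of pixels, so a pixel in the top row covers far less of the sphere than one at the equator; that oversampling is the distortion the abstract refers to. A point set spread uniformly over the sphere places roughly the same share of points in a polar cap as the cap's share of the surface area.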
