Acoustic Volume Rendering for Neural Impulse Response Fields

November 9, 2024
Authors: Zitong Lan, Chenhao Zheng, Zhiwei Zheng, Mingmin Zhao
cs.AI

Abstract

Realistic audio synthesis that captures accurate acoustic phenomena is essential for creating immersive experiences in virtual and augmented reality. Synthesizing the sound received at any position relies on estimating the impulse response (IR), which characterizes how sound propagates through a scene along different paths before arriving at the listener's position. In this paper, we present Acoustic Volume Rendering (AVR), a novel approach that adapts volume rendering techniques to model acoustic impulse responses. While volume rendering has been successful in modeling radiance fields for images and neural scene representations, IRs present unique challenges because they are time-series signals. To address these challenges, we introduce frequency-domain volume rendering and use spherical integration to fit the IR measurements. Our method constructs an impulse response field that inherently encodes wave propagation principles and achieves state-of-the-art performance in synthesizing impulse responses for novel poses. Experiments show that AVR surpasses current leading methods by a substantial margin. Additionally, we develop an acoustic simulation platform, AcoustiX, which provides more accurate and realistic IR simulations than existing simulators. Code for AVR and AcoustiX is available at https://zitonglan.github.io/avr.
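
To make concrete why an estimated IR is enough to synthesize the sound heard at a position, here is a minimal sketch of the standard relationship: the received signal is the source ("dry") signal convolved with the IR. This is not the AVR method itself. The function name `convolve` and the toy values in `toy_ir` and `dry` are assumptions chosen purely for illustration.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution: y[n] = sum_k signal[k] * ir[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Hypothetical toy IR: a direct path at sample 0 and one echo
# arriving 3 samples later at half the amplitude.
toy_ir = [1.0, 0.0, 0.0, 0.5]

# Hypothetical two-sample source signal.
dry = [1.0, 2.0]

# The signal heard at the listener: the source plus its delayed, attenuated copy.
wet = convolve(dry, toy_ir)
print(wet)
```

Each nonzero tap in the IR corresponds to one propagation path, with its index giving the delay and its value giving the attenuation. A model that predicts the IR at a new listener pose can therefore render any source signal at that pose.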
