

WORLDMEM: Long-term Consistent World Simulation with Memory

April 16, 2025
作者: Zeqi Xiao, Yushi Lan, Yifan Zhou, Wenqi Ouyang, Shuai Yang, Yanhong Zeng, Xingang Pan
cs.AI

Abstract

World simulation has gained increasing popularity due to its ability to model virtual environments and predict the consequences of actions. However, the limited temporal context window often leads to failures in maintaining long-term consistency, particularly in preserving 3D spatial consistency. In this work, we present WorldMem, a framework that enhances scene generation with a memory bank consisting of memory units that store memory frames and states (e.g., poses and timestamps). By employing a memory attention mechanism that effectively extracts relevant information from these memory frames based on their states, our method is capable of accurately reconstructing previously observed scenes, even under significant viewpoint or temporal gaps. Furthermore, by incorporating timestamps into the states, our framework not only models a static world but also captures its dynamic evolution over time, enabling both perception and interaction within the simulated world. Extensive experiments in both virtual and real scenarios validate the effectiveness of our approach.
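The abstract describes a memory bank of memory units, each storing a frame together with its state (pose and timestamp), from which a memory attention mechanism retrieves the units most relevant to the current state. The following is a minimal sketch of that retrieval idea, not the paper's implementation: the class name, state layout, and Euclidean-distance scoring (standing in for the learned memory attention) are all illustrative assumptions.

```python
import numpy as np

class MemoryBank:
    """Hypothetical sketch of a WorldMem-style memory bank.

    Each memory unit stores a frame plus its state vector
    (e.g., [x, y, z, yaw, pitch, timestamp]).
    """

    def __init__(self):
        self.frames = []  # list of (H, W, C) frame arrays
        self.states = []  # list of state vectors

    def add(self, frame, state):
        """Append one memory unit (frame + state) to the bank."""
        self.frames.append(np.asarray(frame))
        self.states.append(np.asarray(state, dtype=float))

    def retrieve(self, query_state, k=2):
        """Return the k memory units whose states are closest to the query.

        The paper uses a learned memory attention over states; plain
        Euclidean distance in state space stands in for that scoring here.
        """
        q = np.asarray(query_state, dtype=float)
        dists = [np.linalg.norm(s - q) for s in self.states]
        order = np.argsort(dists)[: min(k, len(dists))]
        return [(self.frames[i], self.states[i]) for i in order]
```

For example, after adding frames observed at two distant poses, querying with a state near the first pose returns that frame first, which is the behavior that lets generation condition on previously observed views even across large viewpoint or temporal gaps.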

