StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians
April 21, 2025
Authors: Cailin Zhuang, Yaoqi Hu, Xuanyang Zhang, Wei Cheng, Jiacheng Bao, Shengqi Liu, Yiying Yang, Xianfang Zeng, Gang Yu, Ming Li
cs.AI
Abstract
3D Gaussian Splatting (3DGS) excels in photorealistic scene reconstruction
but struggles with stylized scenarios (e.g., cartoons, games) due to fragmented
textures, semantic misalignment, and limited adaptability to abstract
aesthetics. We propose StyleMe3D, a holistic framework for 3DGS style transfer
that integrates multi-modal style conditioning, multi-level semantic alignment,
and perceptual quality enhancement. Our key insights include: (1) optimizing
only RGB attributes preserves geometric integrity during stylization; (2)
disentangling low-, medium-, and high-level semantics is critical for coherent
style transfer; (3) scalability across isolated objects and complex scenes is
essential for practical deployment. StyleMe3D introduces four novel components:
Dynamic Style Score Distillation (DSSD), leveraging Stable Diffusion's latent
space for semantic alignment; Contrastive Style Descriptor (CSD) for localized,
content-aware texture transfer; Simultaneously Optimized Scale (SOS) to
decouple style details and structural coherence; and 3D Gaussian Quality
Assessment (3DG-QA), a differentiable aesthetic prior trained on human-rated
data to suppress artifacts and enhance visual harmony. Evaluated on the NeRF
synthetic dataset (objects) and the tandt db dataset (scenes), StyleMe3D
outperforms state-of-the-art methods in preserving geometric details (e.g.,
carvings on sculptures) and ensuring stylistic consistency across scenes (e.g.,
coherent lighting in landscapes), while maintaining real-time rendering. This
work bridges photorealistic 3DGS and artistic stylization, unlocking
applications in gaming, virtual worlds, and digital art.
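
Insight (1) above, that optimizing only RGB attributes preserves geometry, can be illustrated with a toy sketch. This is not the authors' implementation: it is a 1D analogue in plain Python in which the splat positions and scales stay fixed and only the per-splat colors receive gradient updates. A plain mean-squared error against a target image stands in for the paper's style losses, and every function and variable name here is hypothetical.

```python
import math

def splat_weights(positions, scales, pixel_xs):
    """Normalized Gaussian weight of every splat at every pixel (geometry only)."""
    weights = []
    for x in pixel_xs:
        row = [math.exp(-0.5 * ((x - p) / s) ** 2) for p, s in zip(positions, scales)]
        total = sum(row) or 1.0
        weights.append([w / total for w in row])
    return weights

def render(weights, colors):
    """Blend per-splat RGB colors into per-pixel RGB values."""
    return [[sum(w * c[ch] for w, c in zip(row, colors)) for ch in range(3)]
            for row in weights]

def mse(image, target):
    """Mean over pixels of the squared RGB error (stand-in for a style loss)."""
    return sum((a - b) ** 2 for p, t in zip(image, target)
               for a, b in zip(p, t)) / len(image)

def stylize_colors(weights, colors, target, steps=200, lr=0.5):
    """Gradient descent on colors only; the weights (geometry) are never updated."""
    colors = [list(c) for c in colors]  # copy so the caller's colors are untouched
    n = len(target)
    for _ in range(steps):
        image = render(weights, colors)
        grads = [[0.0, 0.0, 0.0] for _ in colors]
        for row, pix, tgt in zip(weights, image, target):
            for k, w in enumerate(row):
                for ch in range(3):
                    grads[k][ch] += 2.0 * w * (pix[ch] - tgt[ch]) / n
        for k in range(len(colors)):
            for ch in range(3):
                colors[k][ch] -= lr * grads[k][ch]
    return colors

# Toy scene: three splats on the unit interval, all starting mid-gray.
positions = [0.1, 0.5, 0.9]
scales = [0.15, 0.15, 0.15]
initial_colors = [[0.5, 0.5, 0.5] for _ in positions]

# Hypothetical "stylized" target: an 8-pixel red-to-blue gradient.
pixel_xs = [i / 7 for i in range(8)]
target = [[1.0 - x, 0.2, x] for x in pixel_xs]

weights = splat_weights(positions, scales, pixel_xs)
before = mse(render(weights, initial_colors), target)
styled_colors = stylize_colors(weights, initial_colors, target)
after = mse(render(weights, styled_colors), target)
```

Because the blending weights depend only on positions and scales, and those are computed once and never touched, the "shape" of the scene is identical before and after; only its appearance changes.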