

RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling

March 12, 2025
Authors: Itay Chachy, Guy Yariv, Sagie Benaim
cs.AI

Abstract

Score Distillation Sampling (SDS) has emerged as an effective technique for leveraging 2D diffusion priors for tasks such as text-to-3D generation. While powerful, SDS struggles with achieving fine-grained alignment to user intent. To overcome this, we introduce RewardSDS, a novel approach that weights noise samples based on alignment scores from a reward model, producing a weighted SDS loss. This loss prioritizes gradients from noise samples that yield aligned, high-reward outputs. Our approach is broadly applicable and can extend SDS-based methods. In particular, we demonstrate its applicability to Variational Score Distillation (VSD) by introducing RewardVSD. We evaluate RewardSDS and RewardVSD on text-to-image, 2D editing, and text-to-3D generation tasks, showing significant improvements over SDS and VSD on a diverse set of metrics measuring generation quality and alignment to desired reward models, enabling state-of-the-art performance. The project page is available at https://itaychachy.github.io/reward-sds/.
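The core idea of the abstract lends itself to a short sketch. Below is a minimal, hypothetical PyTorch rendition of one reward-weighted SDS step: the `diffusion`, `reward_model`, and `prompt_emb` interfaces are placeholders (not a real library API), and the softmax weighting over reward scores is one plausible instantiation of "weighting noise samples by alignment scores," not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def reward_weighted_sds_loss(latents, diffusion, reward_model, prompt_emb,
                             alphas_cumprod, num_noise_samples=4, tau=0.5):
    """Sketch of one reward-weighted SDS step (batch size 1 assumed).

    Assumed placeholder interfaces:
      diffusion(noisy, t, prompt_emb) -> predicted noise eps_hat
      reward_model(x0_hat)            -> scalar alignment score (higher = better)
      alphas_cumprod                  -> cumulative noise-schedule products
    """
    # Sample a diffusion timestep and look up the corresponding alpha-bar.
    t = torch.randint(50, 950, (1,), device=latents.device)
    a_bar = alphas_cumprod[t].view(1, 1, 1, 1)

    residuals, scores = [], []
    for _ in range(num_noise_samples):
        eps = torch.randn_like(latents)
        noisy = a_bar.sqrt() * latents + (1.0 - a_bar).sqrt() * eps
        with torch.no_grad():
            eps_hat = diffusion(noisy, t, prompt_emb)
            # One-step estimate of the clean sample, used only for scoring.
            x0_hat = (noisy - (1.0 - a_bar).sqrt() * eps_hat) / a_bar.sqrt()
            scores.append(reward_model(x0_hat).reshape(()))
        residuals.append(eps_hat - eps)  # standard SDS residual per noise draw

    # Softmax over reward scores: high-reward noise draws dominate the update.
    weights = F.softmax(torch.stack(scores) / tau, dim=0)
    grad = sum(w * r for w, r in zip(weights, residuals))

    # Usual SDS stop-gradient trick: d(loss)/d(latents) == grad.
    # (The timestep weighting w(t) is omitted for brevity.)
    return (grad.detach() * latents).sum()
```

In this sketch, lowering the temperature `tau` sharpens the weighting toward the single best-scoring noise draw, while a very large `tau` recovers a plain average over draws, i.e., vanilla SDS behavior.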
