The Superposition of Diffusion Models Using the Itô Density Estimator
December 23, 2024
Authors: Marta Skreta, Lazar Atanackovic, Avishek Joey Bose, Alexander Tong, Kirill Neklyudov
cs.AI
Abstract
The Cambrian explosion of easily accessible pre-trained diffusion models
suggests a demand for methods that combine multiple different pre-trained
diffusion models without incurring the significant computational burden of
re-training a larger combined model. In this paper, we cast the problem of
combining multiple pre-trained diffusion models at the generation stage under a
novel proposed framework termed superposition. Theoretically, we derive
superposition from rigorous first principles stemming from the celebrated
continuity equation and design two novel algorithms tailor-made for combining
diffusion models in SuperDiff. SuperDiff leverages a new scalable Itô density
estimator for the log likelihood of the diffusion SDE which incurs no
additional overhead compared to the well-known Hutchinson's estimator needed
for divergence calculations. We demonstrate that SuperDiff is scalable to large
pre-trained diffusion models as superposition is performed solely through
composition during inference, and also enjoys painless implementation as it
combines different pre-trained vector fields through an automated re-weighting
scheme. Notably, we show that SuperDiff is efficient during inference time, and
mimics traditional composition operators such as the logical OR and the logical
AND. We empirically demonstrate the utility of using SuperDiff for generating
more diverse images on CIFAR-10, more faithful prompt-conditioned image editing
using Stable Diffusion, and improved unconditional de novo structure design of
proteins. https://github.com/necludov/super-diffusion
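
The abstract names Hutchinson's estimator as the usual tool for divergence calculations. For readers unfamiliar with it, the following is a minimal sketch of that baseline only; it is not the Itô density estimator proposed in the paper, and it is not taken from the authors' repository. The function name `hutchinson_divergence`, the finite-difference approximation of the Jacobian-vector product, and the linear test field `f(x) = A x` are all assumptions made for illustration.

```python
import numpy as np


def hutchinson_divergence(f, x, n_probes=1000, h=1e-4, rng=None):
    """Estimate div f(x) = tr(df/dx) as the average of eps^T J eps.

    eps are Rademacher probes (entries +1 or -1), so E[eps eps^T] = I
    and E[eps^T J eps] = tr(J). The product J @ eps is approximated
    with a central finite difference (an assumption of this sketch;
    autodiff would normally be used instead).
    """
    rng = np.random.default_rng() if rng is None else rng
    total = 0.0
    for _ in range(n_probes):
        eps = rng.choice([-1.0, 1.0], size=x.shape)
        jvp = (f(x + h * eps) - f(x - h * eps)) / (2.0 * h)
        total += eps @ jvp
    return total / n_probes


# Hypothetical test field: for f(x) = A x the divergence is exactly tr(A).
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
f = lambda x: A @ x
x0 = rng.normal(size=5)

estimate = hutchinson_divergence(f, x0, n_probes=20000, rng=rng)
exact = np.trace(A)
print("Hutchinson estimate:", estimate)
print("Exact divergence:   ", exact)
```

The estimate is unbiased but noisy, and its variance shrinks as the number of probes grows. This per-step stochastic cost is the overhead against which the abstract compares the Itô density estimator.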