The Superposition of Diffusion Models Using the Itô Density Estimator

December 23, 2024
Authors: Marta Skreta, Lazar Atanackovic, Avishek Joey Bose, Alexander Tong, Kirill Neklyudov
cs.AI

Abstract

The Cambrian explosion of easily accessible pre-trained diffusion models suggests a demand for methods that combine multiple different pre-trained diffusion models without incurring the significant computational burden of re-training a larger combined model. In this paper, we cast the problem of combining multiple pre-trained diffusion models at the generation stage under a novel proposed framework termed superposition. Theoretically, we derive superposition from rigorous first principles stemming from the celebrated continuity equation and design two novel algorithms tailor-made for combining diffusion models in SuperDiff. SuperDiff leverages a new scalable Itô density estimator for the log likelihood of the diffusion SDE which incurs no additional overhead compared to the well-known Hutchinson's estimator needed for divergence calculations. We demonstrate that SuperDiff is scalable to large pre-trained diffusion models as superposition is performed solely through composition during inference, and also enjoys painless implementation as it combines different pre-trained vector fields through an automated re-weighting scheme. Notably, we show that SuperDiff is efficient during inference time, and mimics traditional composition operators such as the logical OR and the logical AND. We empirically demonstrate the utility of using SuperDiff for generating more diverse images on CIFAR-10, more faithful prompt conditioned image editing using Stable Diffusion, and improved unconditional de novo structure design of proteins. https://github.com/necludov/super-diffusion
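
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of Hutchinson's estimator for the divergence of a vector field, i.e. the trace of its Jacobian. It is not the paper's Itô density estimator and is not taken from the linked repository. The function name `hutchinson_divergence`, the finite-difference Jacobian-vector product, and the toy linear field are all assumptions made for illustration.

```python
import numpy as np


def hutchinson_divergence(f, x, n_probes=2000, eps=1e-4, seed=0):
    """Estimate div f(x) = tr(J_f(x)) using random Rademacher probes.

    Hypothetical helper for illustration. The Jacobian-vector product
    J_f(x) v is approximated with a central finite difference, so only
    evaluations of f are required.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_probes):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        v = rng.choice([-1.0, 1.0], size=x.shape)
        # Central-difference approximation of J_f(x) @ v.
        jvp = (f(x + eps * v) - f(x - eps * v)) / (2.0 * eps)
        # v^T J v is an unbiased estimate of tr(J).
        total += float(v @ jvp)
    return total / n_probes


# Toy linear field f(x) = A x, whose divergence is exactly tr(A).
A = np.array([[1.0, 2.0, 0.0],
              [0.5, -3.0, 1.0],
              [0.0, 4.0, 2.5]])


def linear_field(x):
    return A @ x


x0 = np.array([0.3, -1.2, 0.7])
estimate = hutchinson_divergence(linear_field, x0)
exact = float(np.trace(A))
print(f"Hutchinson estimate: {estimate:.3f}, exact divergence: {exact:.3f}")
```

The estimate is stochastic and approaches the exact trace only as the number of probes grows. In a diffusion model the vector field would be a neural network, and the Jacobian-vector product would typically come from automatic differentiation rather than finite differences.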
