Inductive Moment Matching

Summary

Diffusion models and Flow Matching generate high-quality samples but are slow at inference, and distilling them into few-step models often leads to instability and extensive tuning. To resolve these trade-offs, we propose Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. Unlike distillation, IMM requires neither pre-training initialization nor the optimization of two networks; and unlike Consistency Models, IMM guarantees distribution-level convergence and remains stable under various hyperparameters and standard model architectures. IMM surpasses diffusion models on ImageNet-256x256 with an FID of 1.99 using only 8 inference steps and achieves a state-of-the-art 2-step FID of 1.98 on CIFAR-10 for a model trained from scratch.
