1.58-bit FLUX
December 24, 2024
Authors: Chenglin Yang, Celong Liu, Xueqing Deng, Dongwon Kim, Xing Mei, Xiaohui Shen, Liang-Chieh Chen
cs.AI
Abstract
We present 1.58-bit FLUX, the first successful approach to quantizing the
state-of-the-art text-to-image generation model, FLUX.1-dev, using 1.58-bit
weights (i.e., values in {-1, 0, +1}) while maintaining comparable performance
for generating 1024 x 1024 images. Notably, our quantization method operates
without access to image data, relying solely on self-supervision from the
FLUX.1-dev model. Additionally, we develop a custom kernel optimized for
1.58-bit operations, achieving a 7.7x reduction in model storage, a 5.1x
reduction in inference memory, and improved inference latency. Extensive
evaluations on the GenEval and T2I Compbench benchmarks demonstrate the
effectiveness of 1.58-bit FLUX in maintaining generation quality while
significantly enhancing computational efficiency.
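The abstract does not include code, and the paper's exact recipe and custom kernel are not described here. Purely as a non-authoritative illustration of what 1.58-bit weights mean (log2(3) ≈ 1.58 bits of information per ternary value), the sketch below quantizes a weight tensor to {-1, 0, +1} using the absmean scheme popularized by BitNet b1.58 and packs the resulting codes at 2 bits each. The function names are hypothetical and this is an assumption about the approach, not the authors' implementation.

```python
import torch

def quantize_to_1_58_bit(weight: torch.Tensor, eps: float = 1e-5):
    """Map weights to ternary codes in {-1, 0, +1} with a per-tensor scale.

    Assumption: absmean quantization as in BitNet b1.58; the 1.58-bit FLUX
    paper does not publish its exact recipe in the abstract.
    """
    scale = weight.abs().mean().clamp(min=eps)      # per-tensor absmean scale
    codes = (weight / scale).round().clamp(-1, 1)   # ternary codes in {-1, 0, +1}
    return codes, scale                             # dequantize as codes * scale

def pack_ternary(codes: torch.Tensor) -> torch.Tensor:
    """Pack ternary codes into 2 bits each, i.e. 4 weights per byte."""
    u = (codes + 1).to(torch.uint8).flatten()       # {-1, 0, +1} -> {0, 1, 2}
    pad = (-u.numel()) % 4                          # pad to a multiple of 4 codes
    u = torch.cat([u, u.new_zeros(pad)]).view(-1, 4)
    return u[:, 0] | (u[:, 1] << 2) | (u[:, 2] << 4) | (u[:, 3] << 6)

# Usage: quantize a toy weight matrix and reconstruct it.
w = torch.randn(64, 64)
codes, scale = quantize_to_1_58_bit(w)
w_hat = codes * scale          # ternary reconstruction of w
packed = pack_ternary(codes)   # 4096 weights -> 1024 bytes
```

At 2 bits per packed weight, storage drops roughly 8x relative to 16-bit weights, which is broadly consistent with the 7.7x reduction the abstract reports; per-tensor scales and any layers kept at higher precision plausibly account for the gap.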