1.58-bit FLUX

Abstract

We present 1.58-bit FLUX, the first successful approach to quantizing the state-of-the-art text-to-image generation model, FLUX.1-dev, using 1.58-bit weights (i.e., values in {-1, 0, +1}) while maintaining comparable performance for generating 1024 × 1024 images. Notably, our quantization method operates without access to image data, relying solely on self-supervision from the FLUX.1-dev model. Additionally, we develop a custom kernel optimized for 1.58-bit operations, achieving a 7.7x reduction in model storage, a 5.1x reduction in inference memory, and improved inference latency. Extensive evaluations on the GenEval and T2I CompBench benchmarks demonstrate that 1.58-bit FLUX maintains generation quality while significantly improving computational efficiency.
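To illustrate what "1.58-bit weights" means, the following is a minimal sketch of ternary quantization in plain Python. The figure 1.58 is the information content of a three-valued weight, log2(3) ≈ 1.585 bits. The abstract does not describe the quantizer itself, so the absmean scaling rule used here, along with the names `ternary_quantize` and `dequantize`, is an illustrative assumption and not the method of 1.58-bit FLUX.

```python
import math

def ternary_quantize(weights, eps=1e-8):
    """Map float weights to codes in {-1, 0, +1} plus one shared scale.

    Assumption: the scale is the mean absolute weight (an "absmean" rule);
    the paper's actual quantizer may differ.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round to the nearest integer, then clamp into the ternary range.
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Approximate the original weights from ternary codes and the scale."""
    return [c * scale for c in codes]

# Three equally likely states carry log2(3) bits of information.
BITS_PER_TERNARY_WEIGHT = math.log2(3)

weights = [0.9, -0.05, -1.3, 0.4, 0.0, -0.6]
codes, scale = ternary_quantize(weights)
approx = dequantize(codes, scale)
print(codes)  # [1, 0, -1, 1, 0, -1]
print(round(BITS_PER_TERNARY_WEIGHT, 3))  # 1.585
```

Every weight collapses to one of three codes, so only the codes and a single scale need to be stored. The storage and memory reductions reported in the abstract additionally depend on how the codes are packed and on the custom kernel, neither of which this sketch attempts to model.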
