PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling
March 12, 2025
Authors: Nikolai Körber, Eduard Kromer, Andreas Siebert, Sascha Hauke, Daniel Mueller-Gritschneder, Björn Schuller
cs.AI
Abstract
We introduce PerCoV2, a novel and open ultra-low bit-rate perceptual image
compression system designed for bandwidth- and storage-constrained
applications. Building upon prior work by Careil et al., PerCoV2 extends the
original formulation to the Stable Diffusion 3 ecosystem and enhances entropy
coding efficiency by explicitly modeling the discrete hyper-latent image
distribution. To this end, we conduct a comprehensive comparison of recent
autoregressive methods (VAR and MaskGIT) for entropy modeling and evaluate our
approach on the large-scale MSCOCO-30k benchmark. Compared to previous work,
PerCoV2 (i) achieves higher image fidelity at even lower bit-rates while
maintaining competitive perceptual quality, (ii) features a hybrid generation
mode for further bit-rate savings, and (iii) is built solely on public
components. Code and trained models will be released at
https://github.com/Nikolai10/PerCoV2.
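
To illustrate why explicitly modeling a discrete latent distribution can improve entropy coding efficiency, the following minimal sketch compares the ideal (Shannon) code length of a token sequence under a uniform model and under a sharper predictive model. It is not the PerCoV2 implementation: the function names, the codebook size of 4, the token values, and the probabilities are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def ideal_code_length_bits(tokens, probs):
    """Return the sum of -log2 p(token) over the sequence.

    This is the code length an ideal entropy coder approaches when it
    codes each token with the probability the model assigns to it.
    """
    return sum(-math.log2(p[t]) for t, p in zip(tokens, probs))


def bits_per_pixel(total_bits, height, width):
    """Convert a total code length into a bit-rate in bits per pixel."""
    return total_bits / (height * width)


# Hypothetical toy data: 4 discrete tokens from a codebook of size 4.
tokens = [0, 0, 1, 0]

# Baseline: every codebook entry is assumed equally likely (2 bits/token).
uniform = [[0.25, 0.25, 0.25, 0.25] for _ in tokens]

# Assumed learned model that puts more mass on frequently used entries.
learned = [[0.5, 0.25, 0.125, 0.125] for _ in tokens]

uniform_bits = ideal_code_length_bits(tokens, uniform)
learned_bits = ideal_code_length_bits(tokens, learned)

print(f"uniform model: {uniform_bits} bits")
print(f"learned model: {learned_bits} bits")
print(f"learned model on a 2x2 image: {bits_per_pixel(learned_bits, 2, 2)} bpp")
```

In this toy case the sharper model needs fewer bits for the same tokens because it assigns higher probability to the symbols that actually occur. In a real codec the probabilities would come from a trained entropy model rather than a fixed table.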