PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling

March 12, 2025
Authors: Nikolai Körber, Eduard Kromer, Andreas Siebert, Sascha Hauke, Daniel Mueller-Gritschneder, Björn Schuller
cs.AI

Abstract

We introduce PerCoV2, a novel and open ultra-low bit-rate perceptual image compression system designed for bandwidth- and storage-constrained applications. Building upon prior work by Careil et al., PerCoV2 extends the original formulation to the Stable Diffusion 3 ecosystem and enhances entropy coding efficiency by explicitly modeling the discrete hyper-latent image distribution. To this end, we conduct a comprehensive comparison of recent autoregressive methods (VAR and MaskGIT) for entropy modeling and evaluate our approach on the large-scale MSCOCO-30k benchmark. Compared to previous work, PerCoV2 (i) achieves higher image fidelity at even lower bit-rates while maintaining competitive perceptual quality, (ii) features a hybrid generation mode for further bit-rate savings, and (iii) is built solely on public components. Code and trained models will be released at https://github.com/Nikolai10/PerCoV2.
