Cobra: Efficient Line Art COlorization with BRoAder References

April 16, 2025
Authors: Junhao Zhuang, Lingen Li, Xuan Ju, Zhaoyang Zhang, Chun Yuan, Ying Shan
cs.AI

Abstract

The comic production industry requires reference-based line art colorization with high accuracy, efficiency, contextual consistency, and flexible control. A comic page often involves diverse characters, objects, and backgrounds, which complicates the coloring process. Despite advances in diffusion models for image generation, their application to line art colorization remains limited: they struggle to handle large numbers of reference images, inference is slow, and flexible control is hard to provide. We investigate how important extensive contextual image guidance is to the quality of line art colorization. To address these challenges, we introduce Cobra, an efficient and versatile method that supports color hints and uses over 200 reference images while maintaining low latency. Central to Cobra is a Causal Sparse DiT architecture, which leverages specially designed positional encodings, causal sparse attention, and a Key-Value Cache to manage long-context references effectively and ensure color identity consistency. Results demonstrate that Cobra achieves accurate line art colorization through extensive contextual reference while significantly improving inference speed and interactivity, thereby meeting critical industrial demands. We release our code and models on our project page: https://zhuang2002.github.io/Cobra/.
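The abstract names causal sparse attention and a Key-Value Cache but does not spell out the attention pattern. The sketch below shows one plausible reading, and it is an assumption rather than the authors' implementation: each reference image attends only to its own tokens (the sparse part), while the line-art target attends to every reference and to itself but is never attended to by the references (the causal part). The function name `causal_sparse_mask`, its parameters, and the token layout are hypothetical and chosen only for illustration.

```python
import numpy as np


def causal_sparse_mask(num_refs: int, ref_len: int, target_len: int) -> np.ndarray:
    """Build a boolean attention mask (hypothetical layout, not from the paper).

    Tokens are laid out as [ref_0 | ref_1 | ... | ref_{n-1} | target].
    mask[q, k] is True when query token q is allowed to attend to key token k.
    """
    refs_total = num_refs * ref_len
    total = refs_total + target_len
    mask = np.zeros((total, total), dtype=bool)

    # Sparse part (assumed): each reference attends only within itself,
    # so no reference-to-reference pairs are computed.
    for r in range(num_refs):
        start = r * ref_len
        mask[start:start + ref_len, start:start + ref_len] = True

    # Causal part (assumed): target tokens see all references and themselves;
    # reference rows never include target columns.
    mask[refs_total:, :] = True
    return mask


mask = causal_sparse_mask(num_refs=3, ref_len=2, target_len=2)
allowed_pairs = int(mask.sum())  # attention pairs kept by the mask
dense_pairs = mask.size          # pairs a full attention matrix would compute
```

Under this assumed layout, the reference rows do not depend on the target tokens, so their keys and values could in principle be computed once and reused across denoising steps, which is the kind of reuse a Key-Value Cache provides. The number of allowed pairs also grows linearly with the number of references rather than quadratically.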
