Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement
March 9, 2025
Authors: Yuqi Liu, Bohao Peng, Zhisheng Zhong, Zihao Yue, Fanbin Lu, Bei Yu, Jiaya Jia
cs.AI
Abstract
Traditional methods for reasoning segmentation rely on supervised fine-tuning
with categorical labels and simple descriptions, which limits their out-of-domain
generalization and leaves them without explicit reasoning processes. To address these
limitations, we propose Seg-Zero, a novel framework that demonstrates
remarkable generalizability and derives explicit chain-of-thought reasoning
through cognitive reinforcement. Seg-Zero introduces a decoupled architecture
consisting of a reasoning model and a segmentation model. The reasoning model
interprets user intentions, generates explicit reasoning chains, and produces
positional prompts, which are subsequently used by the segmentation model to
generate precise pixel-level masks. We design a sophisticated reward mechanism
that integrates both format and accuracy rewards to effectively guide
optimization directions. Trained exclusively via reinforcement learning with
GRPO and without explicit reasoning data, Seg-Zero achieves robust zero-shot
generalization and exhibits emergent test-time reasoning capabilities.
Experiments show that Seg-Zero-7B achieves a zero-shot performance of 57.5 on
the ReasonSeg benchmark, surpassing the prior LISA-7B by 18%. This significant
improvement highlights Seg-Zero's ability to generalize across domains while
presenting an explicit reasoning process. Code is available at
https://github.com/dvlab-research/Seg-Zero.
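
The reward design described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that format and accuracy rewards are combined and that training uses GRPO. The think/answer output template, the JSON "bbox" field, the IoU threshold of 0.5, the unweighted sum, and all function names below are assumptions made for illustration. The last function shows the group-relative advantage that GRPO computes by normalizing each sampled output's reward against the other outputs sampled for the same prompt.

```python
import json
import re
from statistics import mean, pstdev

# Assumed output template; the abstract does not specify the real one.
PATTERN = re.compile(r"^<think>(.+?)</think>\s*<answer>(.+?)</answer>$", re.DOTALL)


def format_reward(text):
    """Return 1.0 if the output follows the assumed think/answer template."""
    return 1.0 if PATTERN.match(text.strip()) else 0.0


def box_iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def accuracy_reward(text, gt_box, iou_threshold=0.5):
    """Return 1.0 if the predicted box overlaps the ground truth enough.

    The threshold is an illustrative choice, not a value from the paper.
    """
    match = PATTERN.match(text.strip())
    if not match:
        return 0.0
    try:
        box = [float(v) for v in json.loads(match.group(2))["bbox"]]
    except (ValueError, KeyError, TypeError):
        return 0.0
    if len(box) != 4:
        return 0.0
    return 1.0 if box_iou(box, gt_box) >= iou_threshold else 0.0


def total_reward(text, gt_box):
    """Unweighted sum of the two rewards (the weighting is an assumption)."""
    return format_reward(text) + accuracy_reward(text, gt_box)


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: each reward normalized within its sampled group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In use, several outputs would be sampled for one image and query, each scored with total_reward, and the resulting list passed to group_advantages. Outputs that are both well-formed and well-localized then receive positive advantages relative to the rest of the group, without any annotated reasoning chains.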