State-offset Tuning: State-based Parameter-Efficient Fine-Tuning for State Space Models
March 5, 2025
Authors: Wonjun Kang, Kevin Galim, Yuchen Zeng, Minjae Lee, Hyung Il Koo, Nam Ik Cho
cs.AI
Abstract
State Space Models (SSMs) have emerged as efficient alternatives to
Transformers, mitigating their quadratic computational cost. However, the
application of Parameter-Efficient Fine-Tuning (PEFT) methods to SSMs remains
largely unexplored. In particular, prompt-based methods like Prompt Tuning and
Prefix-Tuning, which are widely used in Transformers, do not perform well on
SSMs. To address this, we propose state-based methods as a superior alternative
to prompt-based methods. This new family of methods naturally stems from the
architectural characteristics of SSMs. State-based methods adjust state-related
features directly instead of depending on external prompts. Furthermore, we
introduce a novel state-based PEFT method: State-offset Tuning. At every
timestep, our method directly affects the state at the current step, leading to
more effective adaptation. Through extensive experiments across diverse
datasets, we demonstrate the effectiveness of our method. Code is available at
https://github.com/furiosa-ai/ssm-state-tuning.
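To make the abstract's idea concrete, below is a minimal sketch of what a state-based PEFT method could look like for a single-channel diagonal SSM: the pretrained recurrence parameters stay frozen and only a per-channel offset, added to the state at every timestep before the readout, is trained. This is an illustration based solely on the abstract, not the authors' implementation (see the linked repository for the official code); the class name `StateOffsetSSM`, the offset's placement before the `C` readout, and all shapes are assumptions.

```python
# Minimal sketch of a state-based PEFT method for a diagonal SSM.
# NOT the authors' implementation; offset placement is an assumption.
import torch
import torch.nn as nn


class StateOffsetSSM(nn.Module):
    """Frozen diagonal SSM with a learnable per-channel state offset.

    Discretized recurrence:  h_t = A * h_{t-1} + B * x_t
    Readout with offset:     y_t = C . (h_t + s) + D * x_t
    Only the offset `s` is trained, so the method is parameter-efficient.
    """

    def __init__(self, d_state: int):
        super().__init__()
        # Frozen "pretrained" SSM parameters (randomly initialized here).
        self.A = nn.Parameter(torch.rand(d_state) * 0.9, requires_grad=False)
        self.B = nn.Parameter(torch.randn(d_state), requires_grad=False)
        self.C = nn.Parameter(torch.randn(d_state), requires_grad=False)
        self.D = nn.Parameter(torch.randn(1), requires_grad=False)
        # The only trainable parameter: an offset that directly affects
        # the state at every timestep (assumed applied before the readout).
        self.state_offset = nn.Parameter(torch.zeros(d_state))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len), a scalar input channel for simplicity.
        batch, seq_len = x.shape
        h = torch.zeros(batch, self.A.shape[0], device=x.device)
        ys = []
        for t in range(seq_len):
            h = self.A * h + self.B * x[:, t:t + 1]   # frozen state update
            h_eff = h + self.state_offset             # state-offset step
            y = (h_eff * self.C).sum(-1) + self.D * x[:, t]
            ys.append(y)
        return torch.stack(ys, dim=1)                 # (batch, seq_len)


if __name__ == "__main__":
    model = StateOffsetSSM(d_state=16)
    trainable = [n for n, p in model.named_parameters() if p.requires_grad]
    print(trainable)              # ['state_offset'] -- the only tuned tensor
    print(model(torch.randn(2, 10)).shape)  # torch.Size([2, 10])
```

Note the contrast with prompt-based methods: rather than prepending virtual tokens that must propagate through the recurrence, the offset acts on the state itself at every step, which is the architectural property the abstract argues makes state-based methods a better fit for SSMs.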