SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs
February 5, 2025
Authors: Dinithi Jayasuriya, Sina Tayebati, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi
cs.AI
Abstract
We propose SPARC, a lightweight continual learning framework for large
language models (LLMs) that enables efficient task adaptation through prompt
tuning in a lower-dimensional space. By leveraging principal component analysis
(PCA), we identify a compact subspace of the training data. Optimizing prompts
in this lower-dimensional space enhances training efficiency, as it focuses
updates on the most relevant features while reducing computational overhead.
Furthermore, since the model's internal structure remains unaltered, the
extensive knowledge gained from pretraining is fully preserved, ensuring that
previously learned information is not compromised during adaptation. Our method
achieves high knowledge retention in both task-incremental and
domain-incremental continual learning setups while fine-tuning only 0.04% of
the model's parameters. Additionally, by integrating LoRA, we enhance
adaptability to computational constraints, allowing for a tradeoff between
accuracy and training cost. Experiments on the SuperGLUE benchmark demonstrate
that our PCA-based prompt tuning combined with LoRA maintains full knowledge
retention while improving accuracy, utilizing only 1% of the model's
parameters. These results establish our approach as a scalable and
resource-efficient solution for continual learning in LLMs.
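The abstract describes the mechanism only in prose, so the following is a minimal PyTorch sketch, not the authors' implementation, of what subspace-aware prompt tuning can look like: a PCA basis is estimated from training-data token embeddings, and only low-dimensional prompt coefficients are trained while the frozen basis projects them back into the model's embedding space. The names `pca_basis`, `SubspacePrompt`, `subspace_dim`, and `n_prompt_tokens` are illustrative assumptions, and the LoRA variant mentioned above is not shown.

```python
import torch
import torch.nn as nn


def pca_basis(token_embeddings: torch.Tensor, subspace_dim: int) -> torch.Tensor:
    """Top principal directions (subspace_dim x embed_dim) of training-data embeddings."""
    # torch.pca_lowrank centers the data and returns V with principal directions as columns.
    _, _, v = torch.pca_lowrank(token_embeddings, q=subspace_dim)
    return v.T


class SubspacePrompt(nn.Module):
    """Soft prompt whose trainable parameters live in a fixed PCA subspace."""

    def __init__(self, basis: torch.Tensor, n_prompt_tokens: int):
        super().__init__()
        self.register_buffer("basis", basis)  # frozen (k, d) projection, not trained
        # Only these n_prompt_tokens x k coefficients are optimized per task.
        self.coeffs = nn.Parameter(torch.zeros(n_prompt_tokens, basis.size(0)))

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # Lift the low-dimensional coefficients into embedding space and
        # prepend the resulting prompt vectors to every sequence in the batch.
        prompt = self.coeffs @ self.basis                               # (p, d)
        prompt = prompt.unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)


# Usage sketch: embed a sample of the new task's tokens with the frozen LLM,
# build the basis, then train only `prompt_module.coeffs` with the usual LM loss
# while all base-model weights stay frozen.
# embeds = frozen_model.get_input_embeddings()(sample_ids).reshape(-1, embed_dim)
# prompt_module = SubspacePrompt(pca_basis(embeds, subspace_dim=16), n_prompt_tokens=20)
```

Because the trainable parameters are just the prompt coefficients in the k-dimensional subspace, the per-task footprint stays a tiny fraction of the model size, which is consistent with the 0.04% figure reported in the abstract.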