Decoupling Angles and Strength in Low-rank Adaptation
March 23, 2025
Authors: Massimo Bini, Leander Girrbach, Zeynep Akata
cs.AI
Abstract
Parameter-Efficient FineTuning (PEFT) methods have recently gained
significant popularity thanks to the widespread availability of large-scale
pretrained models. These methods allow for quick adaptation to downstream tasks
with minimal computational cost. However, popular finetuning methods such as
LoRA exhibit limited robustness when it comes to hyperparameter choices or
extended training regimes, preventing optimal out-of-the-box performance. In
contrast, bounded approaches, such as ETHER, provide greater robustness but are
limited to extremely low-rank adaptations and fixed-strength transformations,
reducing their adaptation expressive power. In this work, we propose Decoupled
Low-rank Adaptation (DeLoRA), a novel finetuning method that normalizes and
scales learnable low-rank matrices. By bounding the distance of the
transformation, DeLoRA effectively decouples the angular learning from the
adaptation strength, enhancing robustness without compromising performance.
Through evaluations on subject-driven image generation, natural language
understanding, and instruction tuning, we show that DeLoRA matches or surpasses the
performance of competing PEFT methods, while exhibiting stronger robustness.
Code is available at https://github.com/ExplainableML/DeLoRA.
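Below is a minimal sketch of what "normalizing and scaling learnable low-rank matrices" could look like in PyTorch, based only on the abstract's description. The class name `DeLoRALinear`, the per-rank-one-component normalization, and the initialization constants are illustrative assumptions, not the paper's exact formulation; consult the linked repository for the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DeLoRALinear(nn.Module):
    """Sketch of a DeLoRA-style adapter around a frozen linear layer.

    The low-rank update is built from normalized rank-one components and
    rescaled by a single learnable strength scalar, so the direction
    ("angle") of the update is learned separately from its magnitude.
    """

    def __init__(self, base: nn.Linear, rank: int = 8, init_strength: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        out_features, in_features = base.weight.shape
        # Low-rank factors, named A/B following the LoRA convention.
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.randn(out_features, rank) * 0.01)
        # Learnable scalar controlling the adaptation strength (assumed form).
        self.strength = nn.Parameter(torch.tensor(float(init_strength)))
        self.rank = rank

    def delta_weight(self) -> torch.Tensor:
        # Each rank-one component b_i a_i^T is divided by ||b_i|| * ||a_i||,
        # giving it unit Frobenius norm. The sum of r such terms, scaled by
        # strength / r, then has Frobenius norm at most `strength`, which
        # bounds the distance of the transformation as the abstract describes.
        a_norms = self.A.norm(dim=1)                     # (rank,)
        b_norms = self.B.norm(dim=0)                     # (rank,)
        inv = 1.0 / (a_norms * b_norms).clamp_min(1e-8)  # (rank,)
        return (self.strength / self.rank) * (self.B * inv) @ self.A

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + F.linear(x, self.delta_weight())
```

A quick usage check of the sketch:

```python
adapter = DeLoRALinear(nn.Linear(768, 768), rank=8)
y = adapter(torch.randn(4, 768))  # only A, B, and strength receive gradients
```

The point of this construction is the decoupling the abstract refers to: gradients on A and B can only rotate the normalized update direction, while its overall magnitude is controlled entirely by the separate strength parameter, keeping the update bounded regardless of how large the factors grow during long training runs.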