Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models
September 10, 2024
Authors: Yao Shu, Wenyang Hu, See-Kiong Ng, Bryan Kian Hsiang Low, Fei Richard Yu
cs.AI
Abstract
Large Language Models (LLMs) have become indispensable in numerous real-world
applications. Unfortunately, fine-tuning these models at scale, especially in
federated settings where data privacy and communication efficiency are
critical, presents significant challenges. Existing methods often resort to
parameter-efficient fine-tuning (PEFT) to mitigate communication overhead, but
this typically comes at the cost of model accuracy. To address these
limitations, we propose federated full-parameter tuning at scale for LLMs
(Ferret), the first first-order method with shared randomness to enable
scalable full-parameter tuning of LLMs across decentralized data sources while
maintaining competitive model accuracy. Ferret accomplishes this through three
aspects: (1) it employs widely applied first-order methods for efficient local
updates; (2) it projects these updates into a low-dimensional space to
considerably reduce communication overhead; and (3) it reconstructs local
updates from this low-dimensional space with shared randomness to facilitate
effective full-parameter global aggregation, ensuring fast convergence and
competitive final performance. Our rigorous theoretical analyses and insights,
along with extensive experiments, show that Ferret significantly enhances the
scalability of existing federated full-parameter tuning approaches by achieving
high computational efficiency, reduced communication overhead, and fast
convergence, all while maintaining competitive model accuracy. Our
implementation is available at https://github.com/allen4747/Ferret.
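
The three aspects listed in the abstract (local first-order updates, projection into a low-dimensional space, and reconstruction with shared randomness) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain Gaussian random projection with a 1/k scaling, a toy least-squares objective, and hypothetical helper names (`random_basis`, `project`, `reconstruct`, `local_update`, `federated_round`). The paper's actual projection, normalization, and aggregation details may differ.

```python
import numpy as np


def random_basis(seed, dim, k):
    """Regenerate the same (dim, k) Gaussian basis from a shared seed.

    Assumption: client and server agree on `seed`, so the basis itself
    never has to be transmitted.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, k))


def project(delta, seed, k):
    """Client side: compress a full update into k coefficients."""
    basis = random_basis(seed, delta.size, k)
    # Dividing by k makes reconstruct(project(delta)) unbiased,
    # because E[basis @ basis.T] = k * I for standard normal entries.
    return basis.T @ delta / k


def reconstruct(coeffs, seed, dim):
    """Server side: rebuild an approximate full update from coefficients."""
    basis = random_basis(seed, dim, coeffs.size)
    return basis @ coeffs


def local_update(weights, features, targets, lr=0.1, steps=5):
    """A few gradient-descent steps on a toy least-squares loss.

    Returns the change in weights, which stands in for a client's
    first-order local update.
    """
    w = weights.copy()
    for _ in range(steps):
        grad = features.T @ (features @ w - targets) / len(targets)
        w -= lr * grad
    return w - weights


def federated_round(weights, client_data, seed, k):
    """One round: each client sends k numbers; the server averages them."""
    messages = [
        project(local_update(weights, x, y), seed, k) for x, y in client_data
    ]
    # Reconstruction is linear, so averaging the coefficients first and
    # reconstructing once equals averaging the reconstructed updates.
    mean_coeffs = np.mean(messages, axis=0)
    return weights + reconstruct(mean_coeffs, seed, weights.size)
```

In this sketch each client transmits only `k` floats per round instead of one per model parameter, and the server still applies an update to every parameter. Under these assumptions the relative reconstruction error shrinks roughly as the square root of `dim / k`, so `k` trades communication cost against fidelity of the reconstructed update.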