Transformer^2: Self-adaptive LLMs
January 9, 2025
Authors: Qi Sun, Edoardo Cetin, Yujin Tang
cs.AI
Abstract
Self-adaptive large language models (LLMs) aim to solve the challenges posed by traditional fine-tuning methods, which are often computationally intensive and static in their ability to handle diverse tasks. We introduce Transformer^2, a novel self-adaptation framework that adapts LLMs to unseen tasks in real time by selectively adjusting only the singular components of their weight matrices. During inference, Transformer^2 employs a two-pass mechanism: first, a dispatch system identifies the task properties, and then task-specific "expert" vectors, trained using reinforcement learning, are dynamically mixed to obtain targeted behavior for the incoming prompt. Our method outperforms ubiquitous approaches such as LoRA, with fewer parameters and greater efficiency. Transformer^2 demonstrates versatility across different LLM architectures and modalities, including vision-language tasks. Transformer^2 represents a significant leap forward, offering a scalable, efficient solution for enhancing the adaptability and task-specific performance of LLMs, paving the way for truly dynamic, self-organizing AI systems.
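
To make the "singular components" idea concrete, below is a minimal PyTorch sketch, not the authors' code: it rescales a weight matrix's singular values with a mixture of task-specific expert vectors. All names here (adapt_weight, expert_vectors, mixing_weights) are hypothetical, and the mixing weights stand in for the output of the first (dispatch) pass.

import torch

def adapt_weight(W, expert_vectors, mixing_weights):
    # SVD of the frozen pretrained weight: W = U @ diag(S) @ Vh.
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    # The dispatch pass is assumed to have produced mixing_weights;
    # combine the experts into one per-singular-value scaling vector.
    z = mixing_weights @ expert_vectors  # shape: (min(W.shape),)
    # Only the singular values are modulated; U and Vh stay fixed.
    return U @ torch.diag(S * z) @ Vh

# Toy usage: one 64x32 layer, three hypothetical experts.
W = torch.randn(64, 32)
experts = 1.0 + 0.1 * torch.randn(3, 32)          # one scaling vector per expert
alphas = torch.softmax(torch.randn(3), dim=0)     # stand-in for dispatch output
W_adapted = adapt_weight(W, experts, alphas)

Under this reading, each expert contributes only a vector of singular-value scalings per layer, which is consistent with the abstract's claim of needing fewer trainable parameters than a LoRA update.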