Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models

September 27, 2024
Authors: Jiaming Li, Lei Zhang, Yunshui Li, Ziqiang Liu, Yuelin Bai, Run Luo, Longze Chen, Min Yang
cs.AI

Abstract

The instruction-following ability of large language models (LLMs) enables humans to interact with AI agents in a natural way. However, when required to generate responses of a specific length, LLMs often struggle to meet users' needs because they have inherent difficulty perceiving numerical constraints accurately. To explore the ability of LLMs to control the length of generated responses, we propose the Target Length Generation Task (TLG) and design two metrics, Precise Match (PM) and Flexible Match (FM), to evaluate a model's performance in adhering to specified response lengths. Furthermore, we introduce a novel, model-agnostic approach called Ruler, which employs Meta Length Tokens (MLTs) to enhance the instruction-following ability of LLMs under length-constrained instructions. Specifically, Ruler equips LLMs with the ability to generate responses of a specified length based on the length constraints within the instructions. Moreover, when length constraints are not explicitly provided, Ruler can automatically generate an appropriate MLT, demonstrating excellent versatility and generalization. Comprehensive experiments show the effectiveness of Ruler across different LLMs on the Target Length Generation Task, e.g., an average gain of 27.97 on PM and 29.57 on FM at the All level. In addition, we conduct extensive ablation experiments to further substantiate the efficacy and generalization of Ruler. Our code and data are available at https://github.com/Geaming2002/Ruler.
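The abstract does not define how PM and FM are computed or what an MLT looks like, so the following is only a minimal sketch of the general idea under stated assumptions. The token format `[MLT:N]`, the level table `LEVELS`, its numeric bounds, word-based counting, and the helper names `make_mlt` and `score` are all hypothetical and chosen for illustration; the paper and its repository should be consulted for the actual definitions.

```python
from typing import Dict, List, Tuple

# HYPOTHETICAL level table: target length -> (precise range, flexible range),
# measured in whitespace-separated words. These bounds are illustrative
# assumptions, not values taken from the paper.
LEVELS: Dict[int, Tuple[Tuple[int, int], Tuple[int, int]]] = {
    10: ((5, 15), (0, 20)),
    50: ((40, 60), (30, 70)),
    100: ((80, 120), (60, 140)),
}


def make_mlt(target: int) -> str:
    """Build a meta length token string (assumed format, for illustration)."""
    return f"[MLT:{target}]"


def count_words(text: str) -> int:
    """Count whitespace-separated words (an assumed length measure)."""
    return len(text.split())


def in_range(n: int, bounds: Tuple[int, int]) -> bool:
    """Return True if n lies within the inclusive bounds."""
    low, high = bounds
    return low <= n <= high


def score(samples: List[Tuple[int, str]]) -> Dict[str, float]:
    """Compute precise-match and flexible-match rates as percentages.

    Each sample is (target_length, response_text). A response counts toward
    PM if its length falls in the precise range for its target, and toward
    FM if it falls in the (wider) flexible range.
    """
    if not samples:
        return {"PM": 0.0, "FM": 0.0}
    precise_hits = 0
    flexible_hits = 0
    for target, response in samples:
        precise, flexible = LEVELS[target]
        n = count_words(response)
        precise_hits += in_range(n, precise)
        flexible_hits += in_range(n, flexible)
    total = len(samples)
    return {"PM": 100.0 * precise_hits / total, "FM": 100.0 * flexible_hits / total}


if __name__ == "__main__":
    samples = [
        (10, "one two three four five six"),  # 6 words
        (50, " ".join(["w"] * 65)),           # 65 words
        (100, " ".join(["w"] * 200)),         # 200 words
    ]
    print(make_mlt(50))
    print(score(samples))
```

In this sketch the flexible range is simply a wider tolerance band around the same target, so FM is always at least as high as PM; whether the paper's metrics share that property is not stated in the abstract.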
