Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models

Abstract

The instruction-following ability of large language models (LLMs) enables humans to interact with AI agents in a natural way. However, when required to generate responses of a specific length, LLMs often fail to meet users' needs because they have inherent difficulty perceiving numerical constraints accurately. To explore how well LLMs can control the length of their generated responses, we propose the Target Length Generation Task (TLG) and design two metrics, Precise Match (PM) and Flexible Match (FM), to evaluate how closely a model adheres to a specified response length. Furthermore, we introduce Ruler, a novel, model-agnostic approach that employs Meta Length Tokens (MLTs) to enhance the instruction-following ability of LLMs under length-constrained instructions. Specifically, Ruler equips LLMs with the ability to generate responses of a specified length based on the length constraints within the instructions. Moreover, Ruler can automatically generate an appropriate MLT when length constraints are not explicitly provided, demonstrating excellent versatility and generalization. Comprehensive experiments show the effectiveness of Ruler across different LLMs on the Target Length Generation Task, e.g., at All Level, an average gain of 27.97 on PM and 29.57 on FM. In addition, we conduct extensive ablation experiments to further substantiate the efficacy and generalization of Ruler. Our code and data are available at https://github.com/Geaming2002/Ruler.
