DELIFT: Data Efficient Language model Instruction Fine-Tuning

Abstract

Fine-tuning large language models (LLMs) is essential for enhancing their performance on specific tasks, but it is often resource-intensive because of redundant or uninformative data. To address this inefficiency, we introduce DELIFT (Data Efficient Language model Instruction Fine-Tuning), a novel algorithm that systematically optimizes data selection across the three key stages of fine-tuning: (1) instruction tuning, (2) task-specific fine-tuning (e.g., reasoning, question answering), and (3) continual fine-tuning (e.g., incorporating new data versions). Unlike existing methods, which focus on single-stage optimization or rely on computationally intensive gradient calculations, DELIFT operates efficiently across all stages. Central to our approach is a pairwise utility metric that quantifies how beneficial a data sample is for improving the model's responses to other samples, effectively measuring the sample's informational value relative to the model's current capabilities. By applying different submodular functions to this metric, DELIFT selects diverse and optimal subsets that are useful at every stage of fine-tuning. Experiments across various tasks and model scales demonstrate that DELIFT can reduce the fine-tuning data size by up to 70% without compromising performance, offering significant computational savings and outperforming existing methods in both efficiency and efficacy.
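To make the selection idea concrete, the following is a minimal sketch of greedy subset selection over a pairwise utility matrix. It rests on several assumptions that the abstract does not confirm: the utility values are taken as given (the abstract does not state how they are computed), a facility-location objective stands in for the unspecified submodular functions, and the names `facility_location_gain` and `greedy_select` as well as the toy numbers are hypothetical.

```python
from typing import List, Sequence


def facility_location_gain(
    utility: Sequence[Sequence[float]],
    best_cover: Sequence[float],
    candidate: int,
) -> float:
    """Marginal gain of adding `candidate` under a facility-location objective.

    utility[i][j] is ASSUMED to be a nonnegative score for how much sample i
    helps the model respond to sample j. best_cover[j] is the best score that
    the already selected samples achieve for sample j.
    """
    return sum(
        max(utility[candidate][j] - best_cover[j], 0.0)
        for j in range(len(best_cover))
    )


def greedy_select(utility: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Greedily pick up to `budget` sample indices, maximizing the marginal gain each step."""
    n = len(utility)
    selected: List[int] = []
    best_cover = [0.0] * n  # nothing is covered before the first pick

    for _ in range(min(budget, n)):
        best_idx, best_gain = -1, -1.0
        for i in range(n):
            if i in selected:
                continue
            gain = facility_location_gain(utility, best_cover, i)
            if gain > best_gain:  # strict comparison: ties keep the lower index
                best_idx, best_gain = i, gain
        selected.append(best_idx)
        # Update the coverage each sample receives from the chosen subset.
        best_cover = [max(best_cover[j], utility[best_idx][j]) for j in range(n)]

    return selected


# Toy matrix (made-up numbers): samples 0 and 1 are near-duplicates,
# and samples 2 and 3 form a second, unrelated cluster.
toy_utility = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.7, 1.0],
]
chosen = greedy_select(toy_utility, budget=2)
```

On this toy matrix the greedy pass picks one sample from each cluster rather than both near-duplicates, which is the kind of redundancy pruning the abstract describes. A different submodular objective would change which samples are favored, so this sketch illustrates only one possible instantiation.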
