

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

March 25, 2025
Authors: Itay Nakash, Nitay Calderon, Eyal Ben David, Elad Hoffer, Roi Reichart
cs.AI

Abstract

Large Language Models (LLMs) have shown impressive versatility as general-purpose models. However, their broad applicability comes at the cost of high computational overhead, particularly in auto-regressive decoding, where each step requires a forward pass. In domain-specific settings, general-purpose capabilities are unnecessary and can be traded for efficiency. In this work, we take a novel perspective on domain adaptation, reducing latency and computational costs by adapting the vocabulary to focused domains of interest. We introduce AdaptiVocab, an end-to-end approach for vocabulary adaptation designed to enhance LLM efficiency in low-resource domains. AdaptiVocab can be applied to any tokenizer and architecture, modifying the vocabulary by replacing tokens with domain-specific n-gram-based tokens, thereby reducing the number of tokens required for both input processing and output generation. AdaptiVocab initializes new n-token embeddings using an exponentially weighted combination of existing embeddings and employs a lightweight fine-tuning phase that can be performed efficiently on a single GPU. We evaluate two 7B LLMs across three niche domains, assessing efficiency, generation quality, and end-task performance. Our results show that AdaptiVocab reduces token usage by over 25% without compromising performance.
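The embedding-initialization step described in the abstract lends itself to a short illustration. Below is a minimal PyTorch sketch of initializing a new n-gram token's embedding as an exponentially weighted combination of its constituent tokens' embeddings. The function name `init_ngram_embedding`, the decay factor `gamma`, and the direction of the weighting (later tokens weighted more heavily) are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def init_ngram_embedding(token_ids, embedding_matrix, gamma=0.5):
    """Initialize an embedding for a new n-gram token.

    The new vector is an exponentially weighted, normalized combination of
    the embeddings of the n-gram's constituent tokens.

    token_ids:        ids of the original tokens the n-gram replaces
    embedding_matrix: (vocab_size, hidden_dim) tensor of existing embeddings
    gamma:            decay factor in (0, 1] -- a hypothetical default;
                      the paper's exact weighting scheme may differ
    """
    n = len(token_ids)
    # Weight later tokens more heavily (an assumption, not the paper's spec):
    # for n = 3, gamma = 0.5 this gives raw weights [0.25, 0.5, 1.0].
    weights = torch.tensor([gamma ** (n - 1 - i) for i in range(n)])
    weights = weights / weights.sum()  # normalize so the weights sum to 1

    constituents = embedding_matrix[torch.tensor(token_ids)]  # (n, hidden_dim)
    return (weights.unsqueeze(1) * constituents).sum(dim=0)   # (hidden_dim,)

# Example: build an embedding for a new 3-token n-gram from a toy matrix.
emb = torch.randn(32000, 4096)  # stand-in for a 7B model's embedding table
new_vec = init_ngram_embedding([101, 2057, 304], emb)
print(new_vec.shape)            # torch.Size([4096])
```

In a full pipeline, vectors produced this way would seed both the input embedding and output (LM head) rows for each new n-gram token before the lightweight fine-tuning phase refines them.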
