DarwinLM: Evolutionary Structured Pruning of Large Language Models
February 11, 2025
Authors: Shengkun Tang, Oliver Sieberling, Eldar Kurtic, Zhiqiang Shen, Dan Alistarh
cs.AI
Abstract
Large Language Models (LLMs) have achieved significant success across various
NLP tasks. However, their massive computational costs limit their widespread
use, particularly in real-time applications. Structured pruning offers an
effective solution by compressing models and directly providing end-to-end
speed improvements, regardless of the hardware environment. Meanwhile,
different components of the model exhibit varying sensitivities towards
pruning, calling for non-uniform model compression. However, a pruning
method should not only identify a capable substructure, but also account for
post-compression training. To this end, we propose DarwinLM, a method for
training-aware structured pruning. DarwinLM builds upon an evolutionary
search process, generating multiple offspring models in each generation through
mutation, and selecting the fittest for survival. To assess the effect of
post-training, we incorporate a lightweight, multistep training process within
the offspring population, progressively increasing the number of tokens and
eliminating poorly performing models in each selection stage. We validate our
method through extensive experiments on Llama-2-7B, Llama-3.1-8B and
Qwen-2.5-14B-Instruct, achieving state-of-the-art performance for structured
pruning. For instance, DarwinLM surpasses ShearedLlama while requiring
5× less training data during post-compression training.
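The evolutionary search the abstract describes — mutate a parent into offspring, then eliminate weak candidates through selection stages with progressively larger training budgets — can be sketched roughly as below. This is a minimal toy illustration, not the authors' implementation: it assumes candidate models are represented as per-layer sparsity profiles and that fitness is a lower-is-better score (e.g., validation perplexity after a brief finetune); `mutate`, `evolutionary_prune`, and all parameter names are hypothetical.

```python
import random

def mutate(profile, num_levels=5):
    """Move one unit of sparsity from one layer to another,
    keeping the total compression level fixed (non-uniform pruning)."""
    child = profile[:]
    src = random.randrange(len(child))
    dst = random.randrange(len(child))
    if child[src] > 0 and child[dst] < num_levels - 1:
        child[src] -= 1
        child[dst] += 1
    return child

def evolutionary_prune(parent, fitness, generations=10, offspring=8,
                       train_steps=(8, 32, 128), survivors=(4, 2, 1)):
    """Multi-step training-aware selection: each stage finetunes the
    surviving candidates on more tokens (here, abstract 'steps') and
    keeps only the fittest, so weak offspring are eliminated cheaply."""
    for _ in range(generations):
        # Elitism: keep the parent in the pool so fitness never regresses.
        pop = [parent] + [mutate(parent) for _ in range(offspring)]
        for steps, keep in zip(train_steps, survivors):
            pop.sort(key=lambda p: fitness(p, steps))  # lower is better
            pop = pop[:keep]
        parent = pop[0]
    return parent
```

In the paper's setting, `fitness` would involve actually finetuning the pruned candidate for `steps` tokens and measuring its loss; the staged budgets make the search training-aware while keeping most evaluations cheap.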