Optimal Brain Apoptosis
February 25, 2025
Authors: Mingyuan Sun, Zheng Fang, Jiaxu Wang, Junjie Jiang, Delei Kong, Chenming Hu, Yuetong Fang, Renjing Xu
cs.AI
Abstract
The increasing complexity and parameter count of Convolutional Neural
Networks (CNNs) and Transformers pose challenges in terms of computational
efficiency and resource demands. Pruning has been identified as an effective
strategy to address these challenges by removing redundant elements such as
neurons, channels, or connections, thereby enhancing computational efficiency
without heavily compromising performance. This paper builds on the foundational
work of Optimal Brain Damage (OBD) by advancing the methodology of parameter
importance estimation using the Hessian matrix. Unlike previous approaches that
rely on approximations, we introduce Optimal Brain Apoptosis (OBA), a novel
pruning method that computes the Hessian-vector product directly for
each parameter. By decomposing the Hessian matrix across network layers and
identifying conditions under which inter-layer Hessian submatrices are
non-zero, we propose a highly efficient technique for computing the
second-order Taylor expansion of parameters. This approach allows for a more
precise pruning process, particularly in the context of CNNs and Transformers,
as validated in our experiments including VGG19, ResNet32, ResNet50, and
ViT-B/16 on the CIFAR10, CIFAR100, and ImageNet datasets. Our code is available at
https://github.com/NEU-REAL/OBA.
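The saliency criterion behind OBD-style methods, which OBA builds on, estimates the loss change from zeroing a parameter via a second-order Taylor expansion, with the quadratic term obtained from a Hessian-vector product. The sketch below illustrates only that criterion on a toy quadratic loss where the Hessian-vector product is exact; the function names (`hvp`, `saliency`) are illustrative, not from the paper's released code, and a real network would compute the product via double backpropagation rather than an explicit matrix.

```python
# Toy loss L(w) = 0.5 * w^T A w - b^T w, so gradient = A w - b and Hessian = A.
A = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]
w = [0.5, -0.2]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def grad(w):
    return [g - bi for g, bi in zip(matvec(A, w), b)]

def hvp(v):
    # Hessian-vector product; exact for this quadratic loss: H v = A v.
    # In a neural network this would be computed with double backprop.
    return matvec(A, v)

def saliency(w):
    # Second-order Taylor estimate of the loss change when parameter i is
    # zeroed: delta = -w_i e_i, score = g . delta + 0.5 * delta . (H delta).
    g = grad(w)
    scores = []
    for i in range(len(w)):
        delta = [0.0] * len(w)
        delta[i] = -w[i]
        scores.append(dot(g, delta) + 0.5 * dot(delta, hvp(delta)))
    return scores

print(saliency(w))  # the lowest-scoring parameter is the cheapest to prune
```

Here the second parameter has the lower (negative) score, so the Taylor estimate predicts the loss would actually decrease if it were pruned, which is exactly the ranking signal an OBD-style pruner uses.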