Optimal Brain Apoptosis
February 25, 2025
Authors: Mingyuan Sun, Zheng Fang, Jiaxu Wang, Junjie Jiang, Delei Kong, Chenming Hu, Yuetong Fang, Renjing Xu
cs.AI
Abstract
The increasing complexity and parameter count of Convolutional Neural
Networks (CNNs) and Transformers pose challenges in terms of computational
efficiency and resource demands. Pruning has been identified as an effective
strategy to address these challenges by removing redundant elements such as
neurons, channels, or connections, thereby enhancing computational efficiency
without heavily compromising performance. This paper builds on the foundational
work of Optimal Brain Damage (OBD) by advancing the methodology of parameter
importance estimation using the Hessian matrix. Unlike previous approaches that
rely on approximations, we introduce Optimal Brain Apoptosis (OBA), a novel
pruning method that computes the Hessian-vector product directly for
each parameter. By decomposing the Hessian matrix across network layers and
identifying conditions under which inter-layer Hessian submatrices are
non-zero, we propose a highly efficient technique for computing the
second-order Taylor expansion of parameters. This approach allows for a more
precise pruning process, particularly in the context of CNNs and Transformers,
as validated in our experiments including VGG19, ResNet32, ResNet50, and
ViT-B/16 on the CIFAR10, CIFAR100, and ImageNet datasets. Our code is available at
https://github.com/NEU-REAL/OBA.
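The saliency criterion behind OBD-style methods, which OBA builds on, estimates the loss change from zeroing a parameter via a second-order Taylor expansion, with the quadratic term obtained from a Hessian-vector product. The sketch below illustrates only that criterion on a toy quadratic loss where the Hessian-vector product is exact; the function names (`hvp`, `saliency`) are illustrative, not from the paper's released code, and a real network would compute the product via double backpropagation rather than an explicit matrix.

```python
# Toy loss L(w) = 0.5 * w^T A w - b^T w, so gradient = A w - b and Hessian = A.
A = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]
w = [0.5, -0.2]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def grad(w):
    return [g - bi for g, bi in zip(matvec(A, w), b)]

def hvp(v):
    # Hessian-vector product; exact for this quadratic loss: H v = A v.
    # In a neural network this would be computed with double backprop.
    return matvec(A, v)

def saliency(w):
    # Second-order Taylor estimate of the loss change when parameter i is
    # zeroed: delta = -w_i e_i, score = g . delta + 0.5 * delta . (H delta).
    g = grad(w)
    scores = []
    for i in range(len(w)):
        delta = [0.0] * len(w)
        delta[i] = -w[i]
        scores.append(dot(g, delta) + 0.5 * dot(delta, hvp(delta)))
    return scores

print(saliency(w))  # the lowest-scoring parameter is the cheapest to prune
```

Here the second parameter has the lower (negative) score, so the Taylor estimate predicts the loss would actually decrease if it were pruned, which is exactly the ranking signal an OBD-style pruner uses.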