Training Noise Token Pruning
November 27, 2024
Authors: Mingxing Rao, Bohan Jiang, Daniel Moyer
cs.AI
Abstract
In this work we present Training Noise Token (TNT) Pruning for vision
transformers. Our method relaxes the discrete token-dropping condition to
continuous additive noise, providing smooth optimization during training
while retaining the computational gains of discrete dropping in deployment
settings. We provide theoretical connections to the Rate-Distortion
literature, and empirical evaluations on the ImageNet dataset using ViT and
DeiT architectures demonstrate TNT's advantages over previous pruning methods.
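The core idea in the abstract, replacing hard token dropping with continuous additive noise at train time while keeping discrete dropping at inference, can be illustrated with a minimal NumPy sketch. The paper's exact parameterization is not given here, so the per-token `noise_scale` vector and the rule "keep the tokens with the smallest noise scales" are illustrative assumptions, not the authors' method:

```python
import numpy as np

rng = np.random.default_rng(0)

def tnt_train_step(tokens, noise_scale, rng):
    # Training-time relaxation: instead of discretely dropping tokens,
    # perturb each token with Gaussian noise scaled by a per-token
    # (assumed learnable) noise_scale, keeping the objective smooth.
    noise = rng.standard_normal(tokens.shape) * noise_scale[:, None]
    return tokens + noise

def tnt_deploy(tokens, noise_scale, keep_ratio=0.5):
    # Deployment: recover the compute savings of discrete dropping.
    # Assumed heuristic: tokens with the smallest noise scales are
    # treated as most informative and are kept.
    k = max(1, int(keep_ratio * tokens.shape[0]))
    keep = np.argsort(noise_scale)[:k]
    return tokens[keep]

# Hypothetical example: 8 tokens of dimension 4.
tokens = rng.standard_normal((8, 4))
noise_scale = rng.uniform(0.0, 1.0, size=8)

noisy = tnt_train_step(tokens, noise_scale, rng)    # shape (8, 4)
pruned = tnt_deploy(tokens, noise_scale, 0.5)       # shape (4, 4)
print(noisy.shape, pruned.shape)
```

At train time every token survives (in noised form), so gradients flow through all of them; at deployment a fixed fraction is discarded outright, which is where the reduced sequence length yields the quadratic attention savings the abstract refers to.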