Top-nσ: Not All Logits Are You Need
November 12, 2024
Authors: Chenxia Tang, Jianchun Liu, Hongli Xu, Liusheng Huang
cs.AI
Abstract
Large language models (LLMs) typically employ greedy decoding or
low-temperature sampling for reasoning tasks, reflecting a perceived trade-off
between diversity and accuracy. We challenge this convention by introducing
top-nsigma, a novel sampling method that operates directly on pre-softmax
logits by leveraging a statistical threshold. Our key insight is that logits
naturally separate into a Gaussian-distributed noisy region and a distinct
informative region, enabling efficient token filtering without complex
probability manipulations. Unlike existing methods (e.g., top-p, min-p)
that inadvertently include more noise tokens at higher temperatures,
top-nsigma maintains a stable sampling space regardless of temperature
scaling. We also provide a theoretical analysis of top-nsigma to better
understand its behavior. The extensive experimental results across four
reasoning-focused datasets demonstrate that our method not only outperforms
existing sampling approaches but also surpasses greedy decoding, while
maintaining consistent performance even at high temperatures.
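The thresholding idea described in the abstract can be sketched in a few lines: keep only tokens whose logit lies within n standard deviations of the maximum logit, mask the rest, then apply temperature and softmax. This is a minimal NumPy illustration, not the authors' reference implementation; the function name, the default n, and computing σ over the full logit vector are assumptions made here for clarity. Because the mask is computed on raw logits, the set of surviving tokens does not change with temperature, which is the stability property the abstract highlights.

```python
import numpy as np

def top_n_sigma_filter(logits, n=1.0, temperature=1.0):
    """Sketch of top-nsigma sampling on pre-softmax logits.

    Tokens with logit >= max(logits) - n * std(logits) are kept;
    all others are masked to -inf before softmax. (Parameter names
    and defaults are illustrative assumptions.)
    """
    logits = np.asarray(logits, dtype=np.float64)
    sigma = logits.std()
    threshold = logits.max() - n * sigma
    masked = np.where(logits >= threshold, logits, -np.inf)
    # Temperature is applied only after masking, so the surviving
    # token set is independent of the temperature value.
    scaled = masked / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    return probs / probs.sum()
```

Contrast this with top-p or min-p, which filter post-softmax probabilities: there, raising the temperature flattens the distribution and lets more low-probability noise tokens slip under the threshold, whereas the mask above is fixed before temperature enters.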