Better Embeddings with Coupled Adam

February 12, 2025
Authors: Felix Stollenwerk, Tobias Stollenwerk
cs.AI

Abstract

Despite their remarkable capabilities, LLMs learn word representations that exhibit the undesirable yet poorly understood feature of anisotropy. In this paper, we argue that the second moment in Adam is a cause of anisotropic embeddings, and suggest a modified optimizer called Coupled Adam to mitigate the problem. Our experiments demonstrate that Coupled Adam significantly improves the quality of embeddings, while also leading to better upstream and downstream performance on large enough datasets.
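
For readers unfamiliar with the term, the "second moment" in Adam is the running average of squared gradients, kept separately for every parameter. The sketch below shows one step of standard Adam, not Coupled Adam, whose modification is not specified in this abstract. The function name `adam_step` and the toy gradient values are illustrative assumptions; the hyperparameter defaults are the commonly used ones.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of standard Adam (hypothetical helper, for illustration only)."""
    m = beta1 * m + (1 - beta1) * grad        # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for step t (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy values (assumed): two parameters whose gradients differ by a factor of 100.
param = np.array([0.0, 0.0])
grad = np.array([1.0, 0.01])
new_param, m, v = adam_step(param, grad, np.zeros(2), np.zeros(2), t=1)
print(new_param)  # both entries move by roughly lr, despite the 100x gradient gap
```

Because each parameter is divided by the square root of its own second moment, the effective step size is set per parameter rather than by the raw gradient magnitude. That per-parameter normalization is the component of Adam the abstract singles out.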
