I-Con: A Unifying Framework for Representation Learning
April 23, 2025
作者: Shaden Alshammari, John Hershey, Axel Feldmann, William T. Freeman, Mark Hamilton
cs.AI
Abstract
As the field of representation learning grows, there has been a proliferation
of different loss functions to solve different classes of problems. We
introduce a single information-theoretic equation that generalizes a large
collection of modern loss functions in machine learning. In particular, we
introduce a framework that shows that several broad classes of machine learning
methods are precisely minimizing an integrated KL divergence between two
conditional distributions: the supervisory and learned representations. This
viewpoint exposes a hidden information geometry underlying clustering, spectral
methods, dimensionality reduction, contrastive learning, and supervised
learning. This framework enables the development of new loss functions by
combining successful techniques from across the literature. We not only present
a wide array of proofs, connecting over 23 different approaches, but we also
leverage these theoretical results to create state-of-the-art unsupervised
image classifiers that achieve a +8% improvement over the prior
state-of-the-art on unsupervised classification on ImageNet-1K. We also
demonstrate that I-Con can be used to derive principled debiasing methods which
improve contrastive representation learners.
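The abstract states that the unified objective is an integrated KL divergence between a supervisory conditional distribution and a learned one. A minimal sketch of that objective, assuming both distributions are represented as row-wise discrete neighborhood distributions over the data points (the function names and the toy data here are illustrative, not from the paper):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Row-wise KL(p || q) for batches of discrete distributions.

    eps clipping guards against log(0) for zero-probability entries.
    """
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def icon_loss(p_supervisory, q_learned):
    """Average per-point KL divergence between the supervisory and
    learned conditional distributions (one row per data point)."""
    return float(np.mean(kl_divergence(p_supervisory, q_learned)))

# Toy example: 3 points, one-hot supervisory neighborhoods vs. a
# uniform learned distribution. The loss is then log(3) per point.
p = np.eye(3)
q = np.full((3, 3), 1.0 / 3.0)
loss = icon_loss(p, q)  # ~ log(3) ≈ 1.0986
```

Different choices of `p` and `q` (class labels, k-nearest-neighbor graphs, Gaussian kernels over embeddings, etc.) recover the different method families the abstract lists; the objective itself stays fixed.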