I-Con: A Unifying Framework for Representation Learning
April 23, 2025
Authors: Shaden Alshammari, John Hershey, Axel Feldmann, William T. Freeman, Mark Hamilton
cs.AI
Abstract
As the field of representation learning grows, there has been a proliferation of different loss functions to solve different classes of problems. We introduce a single information-theoretic equation that generalizes a large collection of modern loss functions in machine learning. In particular, we introduce a framework that shows that several broad classes of machine learning methods are precisely minimizing an integrated KL divergence between two conditional distributions: the supervisory and learned representations. This viewpoint exposes a hidden information geometry underlying clustering, spectral methods, dimensionality reduction, contrastive learning, and supervised learning. This framework enables the development of new loss functions by combining successful techniques from across the literature. We not only present a wide array of proofs, connecting over 23 different approaches, but we also leverage these theoretical results to create state-of-the-art unsupervised image classifiers that achieve a +8% improvement over the prior state-of-the-art on unsupervised classification on ImageNet-1K. We also demonstrate that I-Con can be used to derive principled debiasing methods which improve contrastive representation learners.
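The central claim above — that many methods are "precisely minimizing an integrated KL divergence between two conditional distributions" — can be written compactly. The symbols below (p for the supervisory conditional distribution, q_phi for the learned one) are our shorthand for the quantities named in the abstract, not necessarily the paper's exact notation:

$$\mathcal{L}(\phi) = \int_{x} D_{\mathrm{KL}}\!\left( p(\,\cdot \mid x) \,\middle\|\, q_{\phi}(\,\cdot \mid x) \right) dx$$

A minimal PyTorch sketch of a discrete, batch-level instance of this objective is below. The function name `icon_loss`, the softmax parameterization of q, and the temperature value are illustrative assumptions for this sketch, not the paper's reference implementation:

```python
import torch
import torch.nn.functional as F

def icon_loss(z: torch.Tensor, p: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Average KL(p(.|i) || q(.|i)) over a batch: a discrete instance of the
    integrated KL objective described in the abstract.

    z: (n, d) learned representations for one batch.
    p: (n, n) row-stochastic supervisory distribution p(j|i) with zero
       diagonal (e.g., uniform over the known positives of item i).
    """
    n = z.shape[0]
    z = F.normalize(z, dim=1)
    sim = (z @ z.T) / temperature                    # learned pairwise affinities
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))  # exclude self-pairs: q(i|i) = 0
    log_q = F.log_softmax(sim, dim=1)                # learned neighbor dist. q(j|i)
    log_q = log_q.masked_fill(self_mask, 0.0)        # avoid 0 * (-inf) = nan below
    # KL(p || q) = sum_j p log p - sum_j p log q; the first term is constant
    # in the model parameters, so minimizing the cross-entropy term suffices.
    return -(p * log_q).sum(dim=1).mean()

# Example supervision (assumed setup): items (2k, 2k+1) are augmentations of
# the same image, so p(j|i) puts all its mass on i's partner.
n, d = 8, 128
idx = torch.arange(n)
p = torch.zeros(n, n)
p[idx, idx ^ 1] = 1.0
z = torch.randn(n, d, requires_grad=True)
icon_loss(z, p).backward()
```

With p concentrated on each item's augmentation partner as above, the cross-entropy term reduces to an InfoNCE-style contrastive loss; per the abstract, other choices of supervisory distribution (e.g., cluster assignments, graph affinities, or class labels) recover other members of the family, such as clustering, spectral methods, and supervised learning.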