Feasible Learning
January 24, 2025
Authors: Juan Ramirez, Ignacio Hounie, Juan Elenter, Jose Gallego-Posada, Meraj Hashemizadeh, Alejandro Ribeiro, Simon Lacoste-Julien
cs.AI
Abstract
We introduce Feasible Learning (FL), a sample-centric learning paradigm where
models are trained by solving a feasibility problem that bounds the loss for
each training sample. In contrast to the ubiquitous Empirical Risk Minimization
(ERM) framework, which optimizes for average performance, FL demands
satisfactory performance on every individual data point. Since any model that
meets the prescribed performance threshold is a valid FL solution, the choice
of optimization algorithm and its dynamics play a crucial role in shaping the
properties of the resulting solutions. In particular, we study a primal-dual
approach which dynamically re-weights the importance of each sample during
training. To address the challenge of setting a meaningful threshold in
practice, we introduce a relaxation of FL that incorporates slack variables of
minimal norm. Our empirical analysis, spanning image classification, age
regression, and preference optimization in large language models, demonstrates
that models trained via FL can learn from data while displaying improved tail
behavior compared to ERM, with only a marginal impact on average performance.
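The primal-dual dynamics described above can be illustrated with a minimal sketch (this is not the authors' code; the toy regression problem, step sizes, and threshold `epsilon` are all assumptions). Each sample carries a multiplier `lam[i]` that acts as its dynamic weight: dual ascent grows the weight of samples whose loss still violates the per-sample bound, while a primal descent step reduces the weighted loss.

```python
import numpy as np

# Toy realizable regression problem (assumed), so a feasible model exists.
rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true  # noiseless, so per-sample loss 0 is achievable

epsilon = 1e-2            # per-sample loss bound (assumed)
eta_p, eta_d = 0.01, 0.05  # primal / dual step sizes (assumed)

theta = np.zeros(d)
lam = np.ones(n)  # one nonnegative multiplier (sample weight) per constraint

for _ in range(5000):
    residual = X @ theta - y
    losses = residual ** 2  # per-sample squared loss
    # Primal step: gradient descent on the weighted loss (1/n) * sum_i lam_i * loss_i
    grad = 2.0 * X.T @ (lam * residual) / n
    theta -= eta_p * grad
    # Dual step: projected gradient ascent on the violations loss_i - epsilon;
    # satisfied samples see their weight shrink, violated ones see it grow.
    lam = np.maximum(0.0, lam + eta_d * (losses - epsilon))

max_loss = float(np.max((X @ theta - y) ** 2))
print(max_loss <= epsilon)
```

Note how the target is the *worst-case* per-sample loss rather than the average: the loop stops adjusting a sample's weight pressure only once that sample's own constraint is met, which is the sample-centric behavior the abstract contrasts with ERM. The paper's slack-variable relaxation (not shown) additionally minimizes the norm of per-sample slacks `s_i` in constraints of the form `loss_i <= epsilon + s_i` when a given threshold is infeasible.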