Atla Selene Mini: A General Purpose Evaluation Model
January 27, 2025
Authors: Andrei Alexandru, Antonia Calvi, Henry Broomfield, Jackson Golden, Kyle Dai, Mathias Leys, Maurice Burger, Max Bartolo, Roman Engeler, Sashank Pisupati, Toby Drane, Young Sun Park
cs.AI
Abstract
We introduce Atla Selene Mini, a state-of-the-art small language
model-as-a-judge (SLMJ). Selene Mini is a general-purpose evaluator that
outperforms the best SLMJs and GPT-4o-mini on overall performance across 11
out-of-distribution benchmarks, spanning absolute scoring, classification, and
pairwise preference tasks. It is the highest-scoring 8B generative model on
RewardBench, surpassing strong baselines like GPT-4o and specialized judges. To
achieve this, we develop a principled data curation strategy that augments
public datasets with synthetically generated critiques and ensures high quality
through filtering and dataset ablations. We train our model on a combined
direct preference optimization (DPO) and supervised fine-tuning (SFT) loss, and
produce a highly promptable evaluator that excels in real-world scenarios.
Selene Mini shows dramatically improved zero-shot agreement with human expert
evaluations on financial and medical industry datasets. It is also robust to
variations in prompt format. Preliminary results indicate that Selene Mini is
the top-ranking evaluator in a live, community-driven Judge Arena. We release
the model weights on HuggingFace
(https://hf.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B) and Ollama to encourage
widespread community adoption.
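
The abstract states that the model is trained on a combined DPO and SFT loss but does not give the exact form. The following is a minimal sketch of one common way to combine the two, assuming sequence-level log-probabilities are already computed. The function name `dpo_sft_loss`, the weight `alpha`, the temperature `beta`, and the per-token normalization of the SFT term are all illustrative assumptions, not details confirmed by the paper.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_sft_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    chosen_num_tokens: int,
    beta: float = 0.1,   # assumed DPO temperature (hypothetical value)
    alpha: float = 1.0,  # assumed weight on the SFT term (hypothetical value)
) -> float:
    """Combined loss for one preference pair: DPO term plus weighted SFT term."""
    # DPO term: reward margin of the chosen over the rejected response,
    # each measured as log-ratio of policy to frozen reference model.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    dpo_term = -log_sigmoid(beta * margin)

    # SFT term: negative log-likelihood of the chosen response,
    # normalized per token (an assumption made for this sketch).
    sft_term = -policy_chosen_logp / chosen_num_tokens

    return dpo_term + alpha * sft_term


# Usage: at initialization the policy equals the reference, so the margin is zero.
loss = dpo_sft_loss(
    policy_chosen_logp=-10.0,
    policy_rejected_logp=-12.0,
    ref_chosen_logp=-10.0,
    ref_rejected_logp=-12.0,
    chosen_num_tokens=5,
)
```

In this formulation the DPO term pushes the policy to prefer the chosen judgment over the rejected one relative to the reference model, while the SFT term keeps the likelihood of the chosen judgment itself from drifting downward.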