Atla Selene Mini: A General Purpose Evaluation Model

Abstract

We introduce Atla Selene Mini, a state-of-the-art small language model-as-a-judge (SLMJ). Selene Mini is a general-purpose evaluator that outperforms the best SLMJs and GPT-4o-mini on overall performance across 11 out-of-distribution benchmarks, spanning absolute scoring, classification, and pairwise preference tasks. It is the highest-scoring 8B generative model on RewardBench, surpassing strong baselines such as GPT-4o and specialized judges. To achieve this, we develop a principled data curation strategy that augments public datasets with synthetically generated critiques and ensures high quality through filtering and dataset ablations. We train our model on a combined direct preference optimization (DPO) and supervised fine-tuning (SFT) loss, producing a highly promptable evaluator that excels in real-world scenarios. Selene Mini shows dramatically improved zero-shot agreement with human expert evaluations on financial and medical industry datasets. It is also robust to variations in prompt format. Preliminary results indicate that Selene Mini is the top-ranking evaluator in a live, community-driven Judge Arena. We release the model weights on HuggingFace (https://hf.co/AtlaAI/Selene-1-Mini-Llama-3.1-8B) and Ollama to encourage widespread community adoption.
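The abstract does not give the exact form of the combined objective, so the following is only a minimal sketch of one common way to combine a DPO term with an SFT term for a single preference pair. The function name `dpo_sft_loss`, the weighting `alpha`, the temperature `beta`, and the length normalisation of the SFT term are all assumptions made for illustration, not details taken from the paper.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_sft_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    chosen_num_tokens: int,
    beta: float = 0.1,   # assumed DPO temperature
    alpha: float = 1.0,  # assumed weight on the SFT term
) -> float:
    """Hypothetical combined loss for one (chosen, rejected) pair.

    All log-probabilities are summed over the response tokens.
    """
    # DPO term: reward the policy for widening the chosen-vs-rejected
    # log-ratio margin relative to the frozen reference model.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    dpo = -log_sigmoid(beta * margin)

    # SFT term: length-normalised negative log-likelihood of the
    # chosen response under the policy.
    sft = -policy_chosen_logp / chosen_num_tokens

    return dpo + alpha * sft


# When the policy equals the reference, the margin is zero and the DPO
# term reduces to log(2); only the SFT term depends on the response.
baseline = dpo_sft_loss(-12.0, -15.0, -12.0, -15.0, chosen_num_tokens=6)

# Raising the chosen response's log-probability lowers both terms.
improved = dpo_sft_loss(-10.0, -15.0, -12.0, -15.0, chosen_num_tokens=6)
```

In this sketch the SFT term keeps the likelihood of the chosen judgment from drifting downward while the DPO term separates chosen from rejected judgments; how the two are actually balanced in Selene Mini's training is not stated in the abstract.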
