Evaluating Multimodal Generative AI with Korean Educational Standards

Abstract

This paper presents the Korean National Educational Test Benchmark (KoNET), a new benchmark designed to evaluate multimodal generative AI systems using Korean national educational tests. KoNET comprises four exams: the Korean Elementary General Educational Development Test (KoEGED), its middle school (KoMGED) and high school (KoHGED) counterparts, and the Korean College Scholastic Ability Test (KoCSAT). These exams are known for their rigorous standards and diverse questions, which allows a comprehensive analysis of AI performance across educational levels. By focusing on Korean, KoNET provides insight into model performance in less-explored languages. We assess a range of models (open-source, open-access, and closed APIs) by examining question difficulty, subject diversity, and human error rates. The code and dataset builder will be made fully open source at https://github.com/naver-ai/KoNET.
