Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models
December 10, 2024
Authors: Fan Zhang, Shulin Tian, Ziqi Huang, Yu Qiao, Ziwei Liu
cs.AI
Abstract
Recent advancements in visual generative models have enabled high-quality
image and video generation, opening diverse applications. However, evaluating
these models often demands sampling hundreds or thousands of images or videos,
making the process computationally expensive, especially for diffusion-based
models with inherently slow sampling. Moreover, existing evaluation methods
rely on rigid pipelines that overlook specific user needs and provide numerical
results without clear explanations. In contrast, humans can quickly form
impressions of a model's capabilities by observing only a few samples. To mimic
this, we propose the Evaluation Agent framework, which employs human-like
strategies for efficient, dynamic, multi-round evaluations using only a few
samples per round, while offering detailed, user-tailored analyses. It offers
four key advantages: 1) efficiency, 2) promptable evaluation tailored to
diverse user needs, 3) explainability beyond single numerical scores, and 4)
scalability across various models and tools. Experiments show that Evaluation
Agent reduces evaluation time to 10% of traditional methods while delivering
comparable results. The Evaluation Agent framework is fully open-sourced to
advance research in visual generative models and their efficient evaluation.
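To make the multi-round, few-sample evaluation strategy described in the abstract more concrete, below is a minimal Python sketch of such a loop. It is an illustrative assumption rather than the authors' released implementation: the names (`agent.propose_prompts`, `agent.assess`, `model.generate`), the per-round sample budget, and the stopping rule are all hypothetical.

```python
# Hypothetical sketch of a dynamic, multi-round evaluation loop in the spirit
# of the Evaluation Agent framework. All names and the stopping criterion are
# illustrative assumptions, not the paper's released code.

from dataclasses import dataclass, field


@dataclass
class RoundResult:
    prompts: list[str]          # probing prompts sampled this round
    observations: list[str]     # agent's textual judgments of the outputs
    open_questions: list[str]   # aspects still unresolved after this round


@dataclass
class EvaluationReport:
    rounds: list[RoundResult] = field(default_factory=list)

    def summary(self) -> str:
        # Aggregate per-round observations into a user-readable analysis
        # instead of a single numerical score.
        return "\n".join(
            f"Round {i}: " + "; ".join(r.observations)
            for i, r in enumerate(self.rounds, start=1)
        )


def evaluate(model, agent, user_query: str,
             samples_per_round: int = 4, max_rounds: int = 5) -> EvaluationReport:
    """Run a multi-round evaluation using only a few samples per round."""
    report = EvaluationReport()
    focus = [user_query]  # open questions drive what to probe next
    for _ in range(max_rounds):
        # 1) The agent proposes a small batch of prompts targeting the current focus.
        prompts = agent.propose_prompts(focus, n=samples_per_round)
        # 2) Only a few outputs are sampled from the generative model (the costly step).
        outputs = [model.generate(p) for p in prompts]
        # 3) The agent inspects the outputs and updates its open questions.
        observations, focus = agent.assess(prompts, outputs)
        report.rounds.append(RoundResult(prompts, observations, focus))
        # 4) Stop early once no unresolved questions remain, saving further sampling.
        if not focus:
            break
    return report
```

The early-stopping condition is what keeps the sample count low relative to fixed-benchmark pipelines: sampling continues only while the agent still has unresolved questions about the user's query, mirroring how a human forms an impression from a handful of examples.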