Evaluating Intelligence via Trial and Error

February 26, 2025
Authors: Jingtao Zhan, Jiahao Zhao, Jiayu Li, Yiqun Liu, Bo Zhang, Qingyao Ai, Jiaxin Mao, Hongning Wang, Min Zhang, Shaoping Ma
cs.AI

Abstract

Intelligence is a crucial trait for species to find solutions within a limited number of trial-and-error attempts. Building on this idea, we introduce Survival Game as a framework to evaluate intelligence based on the number of failed attempts in a trial-and-error process. Fewer failures indicate higher intelligence. When the expectation and variance of failure counts are both finite, it signals the ability to consistently find solutions to new challenges, which we define as the Autonomous Level of intelligence. Using Survival Game, we comprehensively evaluate existing AI systems. Our results show that while AI systems achieve the Autonomous Level in simple tasks, they are still far from it in more complex tasks, such as vision, search, recommendation, and language. While scaling current AI technologies might help, this would come at an astronomical cost. Projections suggest that achieving the Autonomous Level for general tasks would require 10^{26} parameters. To put this into perspective, loading such a massive model requires so many H100 GPUs that their total value is 10^{7} times Apple Inc.'s market value. Even with Moore's Law, supporting such a parameter scale would take 70 years. This staggering cost highlights the complexity of human tasks and the inadequacies of current AI technologies. To further investigate this phenomenon, we conduct a theoretical analysis of Survival Game and its experimental results. Our findings suggest that human tasks possess a criticality property. As a result, reaching the Autonomous Level requires a deep understanding of the task's underlying mechanisms. Current AI systems, however, do not fully grasp these mechanisms and instead rely on superficial mimicry, making it difficult for them to reach the Autonomous Level. We believe Survival Game can not only guide the future development of AI but also offer profound insights into human intelligence.
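
As a rough illustration of the failure-count measure described in the abstract, the following Python sketch counts the failed attempts before the first success on each task and reports the sample mean and variance. The helper names (`failure_count`, `summarize`), the toy tasks, and the fixed guess order are assumptions made for illustration and are not taken from the paper. Sample statistics over a finite set of tasks are always finite, so they can only hint at whether the underlying expectation and variance are finite.

```python
from statistics import mean, pvariance
from typing import Callable, Iterable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def failure_count(
    attempts: Iterable[T], is_solution: Callable[[T], bool]
) -> Optional[int]:
    """Count failed attempts before the first success.

    Returns None if no attempt succeeds (the subject never finds a solution).
    """
    failures = 0
    for attempt in attempts:
        if is_solution(attempt):
            return failures
        failures += 1
    return None


def summarize(counts: Sequence[int]) -> Tuple[float, float]:
    """Return the sample mean and population variance of failure counts."""
    return mean(counts), pvariance(counts)


# Hypothetical toy tasks: each pairs an ordered list of guesses with a target.
tasks = [
    (["a", "b", "c"], "a"),  # solved on the first try: 0 failures
    (["a", "b", "c"], "c"),  # two wrong guesses first: 2 failures
    (["x", "y"], "y"),       # one wrong guess first: 1 failure
]

# Every toy task is solvable, so no count is None here.
counts = [
    failure_count(guesses, lambda g, t=target: g == t)
    for guesses, target in tasks
]
avg_failures, var_failures = summarize(counts)
print(counts, avg_failures, var_failures)
```

In this sketch a lower average failure count corresponds to the abstract's notion of higher intelligence; how the paper actually orders attempts and estimates the moments is not specified in the abstract.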
