Shifting Long-Context LLMs Research from Input to Output

March 6, 2025
Authors: Yuhao Wu, Yushi Bai, Zhiqing Hu, Shangqing Tu, Ming Shan Hee, Juanzi Li, Roy Ka-Wei Lee
cs.AI

Abstract

Recent advancements in long-context Large Language Models (LLMs) have primarily concentrated on processing extended input contexts, resulting in significant strides in long-context comprehension. However, the equally critical aspect of generating long-form outputs has received comparatively less attention. This paper advocates for a paradigm shift in NLP research toward addressing the challenges of long-output generation. Tasks such as novel writing, long-term planning, and complex reasoning require models to understand extensive contexts and produce coherent, contextually rich, and logically consistent extended text. These demands highlight a critical gap in current LLM capabilities. We underscore the importance of this under-explored domain and call for focused efforts to develop foundational LLMs tailored for generating high-quality, long-form outputs, which hold immense potential for real-world applications.
