GRS-QA -- Graph Reasoning-Structured Question Answering Dataset
November 1, 2024
Authors: Anish Pahilajani, Devasha Trivedi, Jincen Shuai, Khin S. Yone, Samyak Rajesh Jain, Namyong Park, Ryan A. Rossi, Nesreen K. Ahmed, Franck Dernoncourt, Yu Wang
cs.AI
Abstract
Large Language Models (LLMs) have excelled in multi-hop question-answering
(M-QA) due to their advanced reasoning abilities. However, the impact of the
inherent reasoning structures on LLM M-QA performance remains unclear, largely
due to the absence of QA datasets that provide fine-grained reasoning
structures. To address this gap, we introduce the Graph Reasoning-Structured
Question Answering Dataset (GRS-QA), which includes both semantic contexts and
reasoning structures for QA pairs. Unlike existing M-QA datasets, where
different reasoning structures are entangled together, GRS-QA explicitly
captures intricate reasoning pathways by constructing reasoning graphs, where
nodes represent textual contexts and edges denote logical flows. These
reasoning graphs of different structures enable a fine-grained evaluation of
LLM reasoning capabilities across various reasoning structures. Our empirical
analysis reveals that LLMs perform differently when handling questions with
varying reasoning structures. This finding facilitates the exploration of
textual structures as compared with semantics.
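To make the abstract's central construct concrete, here is a minimal sketch of a reasoning graph in which nodes hold textual contexts and directed edges denote logical flow. All class and field names below are illustrative assumptions for exposition, not GRS-QA's actual schema.

```python
# Hypothetical sketch of a reasoning graph: nodes carry supporting
# textual contexts, directed edges encode the logical flow between them.
# Names and structure are assumptions, not the dataset's real format.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ReasoningGraph:
    question: str
    answer: str
    nodes: Dict[str, str]                        # node id -> textual context
    edges: List[Tuple[str, str]] = field(default_factory=list)  # (src, dst)

    def hops(self) -> int:
        # For a chain-shaped (linear) graph, the hop count equals the
        # number of edges along the reasoning path.
        return len(self.edges)


# A two-hop "bridge" question: the fact in n2 can only be applied
# after the fact in n1 identifies the intermediate entity.
g = ReasoningGraph(
    question="Who directed the film that won Best Picture in 1995?",
    answer="Robert Zemeckis",
    nodes={
        "n1": "Forrest Gump won the Academy Award for Best Picture in 1995.",
        "n2": "Forrest Gump was directed by Robert Zemeckis.",
    },
    edges=[("n1", "n2")],
)
print(g.hops())  # one edge links the two supporting facts
```

Varying the edge set (a chain, a tree merging two evidence branches, and so on) yields the different reasoning structures over which the paper evaluates LLMs.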