Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting
December 1, 2024
Authors: Thilini Wijesiriwardene, Ruwan Wickramarachchi, Sreeram Vennam, Vinija Jain, Aman Chadha, Amitava Das, Ponnurangam Kumaraguru, Amit Sheth
cs.AI
Abstract
Making analogies is fundamental to cognition. Proportional analogies, which
consist of four terms, are often used to assess linguistic and cognitive
abilities. For instance, completing analogies like "Oxygen is to Gas as <blank>
is to <blank>" requires identifying the semantic relationship (e.g., "type of")
between the first pair of terms ("Oxygen" and "Gas") and finding a second pair
that shares the same relationship (e.g., "Aluminum" and "Metal"). In this work,
we introduce a 15K Multiple-Choice Question Answering (MCQA) dataset for
proportional analogy completion and evaluate the performance of contemporary
Large Language Models (LLMs) in various knowledge-enhanced prompt settings.
Specifically, we augment prompts with three types of knowledge: exemplar,
structured, and targeted. Our results show that despite extensive training
data, solving proportional analogies remains challenging for current LLMs, with
the best model achieving an accuracy of 55%. Notably, we find that providing
targeted knowledge can better assist models in completing proportional
analogies compared to providing exemplars or collections of structured
knowledge.
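The three knowledge-enhanced prompt settings the abstract names (exemplar, structured, and targeted) can be illustrated with a minimal sketch. The helper names, the example question, and the knowledge strings below are hypothetical; the paper's exact prompt templates are not given in the abstract.

```python
# Hypothetical sketch of the three knowledge-enhanced MCQA prompt settings.
# The question and answer choices mirror the example in the abstract.

QUESTION = "Oxygen is to Gas as <blank> is to <blank>"
CHOICES = ["Aluminum : Metal", "Aluminum : Oxygen", "Gas : Metal"]

def base_prompt(question, choices):
    """Plain MCQA prompt with no added knowledge."""
    opts = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(choices))
    return f"Complete the analogy: {question}\n{opts}\nAnswer:"

def with_exemplar(prompt):
    """Exemplar knowledge: prepend a solved analogy (few-shot style)."""
    exemplar = ('Example: "Puppy is to Dog as Kitten is to Cat" '
                "(relation: young form of)\n")
    return exemplar + prompt

def with_structured(prompt):
    """Structured knowledge: prepend KG-style triples about the terms."""
    triples = "(Oxygen, type_of, Gas)\n(Aluminum, type_of, Metal)\n"
    return "Knowledge triples:\n" + triples + prompt

def with_targeted(prompt):
    """Targeted knowledge: state the specific relation to be matched."""
    hint = 'The first pair is linked by the relation "type of".\n'
    return hint + prompt

if __name__ == "__main__":
    p = base_prompt(QUESTION, CHOICES)
    for variant in (with_exemplar, with_structured, with_targeted):
        print(variant(p), end="\n---\n")
```

Each variant feeds the same underlying MCQA question to the model; only the prepended knowledge differs, which is what allows the abstract's comparison across the three settings.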