SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
February 17, 2025
Authors: Jimin Lee, Ingeol Baek, Byeongjeong Kim, Hwanhee Lee
cs.AI
Abstract
Text-to-SQL aims to convert natural language questions into executable SQL
queries. While previous approaches, such as skeleton-masked selection, have
demonstrated strong performance by retrieving similar training examples to
guide large language models (LLMs), they struggle in real-world scenarios where
such examples are unavailable. To overcome this limitation, we propose
Self-Augmentation in-context learning with Fine-grained Example selection for
Text-to-SQL (SAFE-SQL), a novel framework that improves SQL generation by
generating and filtering self-augmented examples. SAFE-SQL first prompts an LLM
to generate multiple Text-to-SQL examples relevant to the test input. Then,
SAFE-SQL filters these examples through three relevance assessments,
constructing high-quality in-context learning examples. Using self-generated
examples, SAFE-SQL surpasses previous zero-shot and few-shot Text-to-SQL
frameworks, achieving higher execution accuracy. Notably, our approach provides
additional performance gains in extra hard and unseen scenarios, where
conventional methods often fail.
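
The generate-then-filter flow described in the abstract might look roughly like the following. This is a minimal sketch, not the authors' implementation: the abstract does not name the three relevance assessments or say how their results are combined, so the three scorers here (`question_overlap`, `length_ratio`, `looks_like_select`), the mean-score rule, and the 0.5 threshold are all hypothetical stand-ins. The generation step is left to the caller, since it requires an LLM call; `candidates` represents whatever examples that call returned.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Example:
    """One self-generated (question, SQL) pair."""
    question: str
    sql: str


# A scorer maps (test question, candidate example) to a relevance in [0, 1].
Scorer = Callable[[str, Example], float]


def filter_examples(test_question: str,
                    candidates: Sequence[Example],
                    scorers: Sequence[Scorer],
                    threshold: float = 0.5) -> List[Example]:
    """Keep candidates whose mean score across all scorers meets the threshold.

    ASSUMPTION: averaging and thresholding are illustrative choices only.
    """
    if not scorers:
        raise ValueError("at least one scorer is required")
    kept = []
    for example in candidates:
        scores = [scorer(test_question, example) for scorer in scorers]
        if sum(scores) / len(scores) >= threshold:
            kept.append(example)
    return kept


def build_prompt(test_question: str, examples: Sequence[Example]) -> str:
    """Lay out the kept examples as in-context demonstrations."""
    parts = [f"Question: {ex.question}\nSQL: {ex.sql}" for ex in examples]
    parts.append(f"Question: {test_question}\nSQL:")
    return "\n\n".join(parts)


# --- Hypothetical toy scorers (placeholders for the paper's assessments) ---

def question_overlap(test_question: str, example: Example) -> float:
    """Jaccard overlap between the two questions' lowercase tokens."""
    a = set(test_question.lower().split())
    b = set(example.question.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def length_ratio(test_question: str, example: Example) -> float:
    """Ratio of the shorter question's token count to the longer one's."""
    a, b = len(test_question.split()), len(example.question.split())
    return min(a, b) / max(a, b) if max(a, b) else 0.0


def looks_like_select(test_question: str, example: Example) -> float:
    """1.0 if the candidate SQL is a SELECT statement, else 0.0."""
    return 1.0 if example.sql.lstrip().upper().startswith("SELECT") else 0.0


if __name__ == "__main__":
    question = "how many singers are there in total"
    candidates = [
        Example("how many singers are there", "SELECT count(*) FROM singer"),
        Example("list all airports in Ohio", "DROP TABLE airports"),
    ]
    scorers = [question_overlap, length_ratio, looks_like_select]
    print(build_prompt(question, filter_examples(question, candidates, scorers)))
```

In a real system, each toy scorer would be replaced by one of the framework's relevance assessments, and the resulting prompt would be sent to the LLM to produce the final SQL query.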