MATATA: a weak-supervised MAthematical Tool-Assisted reasoning for Tabular Applications

Abstract

Mathematical reasoning capabilities are improving with tool-augmented language agents, but existing methods often rely on closed-source or large models, external data, or extensive prompt engineering. This work introduces MATATA, a novel, cost-effective method to train LLM agents for tabular data problems through reasoning, planning, and tool use. With a progressive self-improvement paradigm and iterative weak supervision, it empowers 3.8B/8B Small Language Models (SLMs), making it particularly suited for local hosting and sensitive business contexts where data privacy is crucial. By employing flexible and reusable tools across different datasets, it achieves robust performance with effective scalability across shared tasks. Experiments show that MATATA reaches state-of-the-art performance on FinQA and TAT-QA among reasoning frameworks based on open-source models. Moreover, MATATA models compete with GPT-4-based frameworks on TabMWP while remaining SLMs.
