FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading

Abstract

Large language models (LLMs) fine-tuned on multimodal financial data have demonstrated impressive reasoning capabilities in various financial tasks. However, they often struggle with multi-step, goal-oriented scenarios in interactive financial markets, such as trading, where complex agentic approaches are required to improve decision-making. To address this, we propose FLAG-Trader, a unified architecture that integrates linguistic processing (via LLMs) with gradient-driven reinforcement learning (RL) policy optimization. In this architecture, a partially fine-tuned LLM acts as the policy network, leveraging pre-trained knowledge while adapting to the financial domain through parameter-efficient fine-tuning. Through policy gradient optimization driven by trading rewards, our framework not only enhances LLM performance in trading but also improves results on other financial-domain tasks. We present extensive empirical evidence to validate these enhancements.
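To make the idea of a partially trainable policy optimized by trading rewards more concrete, the following is a minimal toy sketch, not the FLAG-Trader implementation. Everything in it is an assumption made for illustration: the function `frozen_features` stands in for frozen pre-trained layers, a small softmax head stands in for the fine-tuned parameters, the update rule is plain REINFORCE with a mean-return baseline, and the price series is synthetic.

```python
import math
import random

ACTIONS = ("sell", "hold", "buy")  # mapped to positions -1, 0, +1


def frozen_features(price_change):
    # Hypothetical stand-in for frozen pre-trained layers: a fixed map, never updated.
    return [1.0, math.tanh(price_change)]


def policy(head, features):
    # Trainable head: one weight row per action, followed by a softmax.
    logits = [sum(w * f for w, f in zip(row, features)) for row in head]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(price_changes, epochs=300, lr=0.05, gamma=0.5, seed=0):
    rng = random.Random(seed)
    head = [[0.0, 0.0] for _ in ACTIONS]  # only these weights are updated
    for _ in range(epochs):
        # Roll out one episode with the current policy.
        episode = []
        for t in range(len(price_changes) - 1):
            x = frozen_features(price_changes[t])
            probs = policy(head, x)
            action = rng.choices(range(len(ACTIONS)), weights=probs)[0]
            # Toy trading reward: position times the next price change.
            reward = (action - 1) * price_changes[t + 1]
            episode.append((x, probs, action, reward))
        # Discounted reward-to-go for each step.
        returns, running = [], 0.0
        for _, _, _, reward in reversed(episode):
            running = reward + gamma * running
            returns.append(running)
        returns.reverse()
        baseline = sum(returns) / len(returns)  # simple variance reduction
        # REINFORCE update: grad of log softmax w.r.t. logit k is 1[k == a] - p_k.
        for (x, probs, action, _), g in zip(episode, returns):
            for k in range(len(ACTIONS)):
                grad_logit = (1.0 if k == action else 0.0) - probs[k]
                for j, f in enumerate(x):
                    head[k][j] += lr * (g - baseline) * grad_logit * f
    return head


# Synthetic momentum series (an assumption): five up-moves, five down-moves, repeated.
series = ([1.0] * 5 + [-1.0] * 5) * 2
head = train(series)
after_up = policy(head, frozen_features(1.0))
after_down = policy(head, frozen_features(-1.0))
print(dict(zip(ACTIONS, after_up)))
print(dict(zip(ACTIONS, after_down)))
```

On this synthetic series, where an up-move is usually followed by another up-move, the trained head should come to prefer "buy" after an up-move and "sell" after a down-move. In a real system the frozen map and head would be replaced by LLM layers and a parameter-efficient adapter, which this sketch does not attempt to model.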
