Cost-Efficient Reinforcement Learning for Optimal Trade Execution on Dynamic Market Environment
Abstract
Learning a high-performance trade execution model via reinforcement learning (RL) requires interaction with the real, dynamic market. However, the massive number of interactions required by direct RL incurs a significant training overhead. In this paper, we propose a cost-efficient RL approach called Deep Dyna-Double Q-learning (D3Q), which integrates deep reinforcement learning and planning to reduce the training overhead while improving trading performance. Specifically, D3Q includes a learnable market environment model, which approximates the market impact from real market experience, so that policy learning can be enhanced with transitions simulated by the learned environment. Meanwhile, we propose a novel state-balanced exploration scheme that corrects the exploration bias caused by the non-increasing residual inventory during trade execution, thereby accelerating model learning. As demonstrated by our extensive experiments, the proposed D3Q framework significantly increases sample efficiency and also outperforms state-of-the-art methods in average trading cost.
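The core loop the abstract describes, Double Q-learning trained on real transitions plus Dyna-style planning updates drawn from a learned environment model, can be sketched as follows. This is a minimal tabular illustration on a hypothetical 1-D chain task, not the paper's market simulator or network architecture; all names (`step`, `train`, `planning_steps`) are illustrative.

```python
import random
from collections import defaultdict

N_STATES, GOAL = 5, 4  # toy 1-D chain; reward only upon reaching the goal

def step(s, a):
    """Stand-in environment: a=1 moves right, a=0 moves left."""
    s2 = min(s + 1, GOAL) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def train(episodes=200, planning_steps=5, alpha=0.5, gamma=0.9, eps=0.1):
    q1, q2 = defaultdict(float), defaultdict(float)
    model = {}  # (s, a) -> (s', r): the learned environment model
    rng = random.Random(0)

    def double_update(s, a, r, s2):
        # Double Q-learning: one table selects the argmax action,
        # the other evaluates it, reducing overestimation bias.
        qa, qb = (q1, q2) if rng.random() < 0.5 else (q2, q1)
        a_star = max((0, 1), key=lambda x: qa[(s2, x)])
        qa[(s, a)] += alpha * (r + gamma * qb[(s2, a_star)] - qa[(s, a)])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy on the sum of both Q-tables
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: q1[(s, x)] + q2[(s, x)])
            s2, r, done = step(s, a)
            model[(s, a)] = (s2, r)          # fit model from real experience
            double_update(s, a, r, s2)       # learn from the real transition
            for _ in range(planning_steps):  # extra updates from the model
                ps, pa = rng.choice(list(model))
                ps2, pr = model[(ps, pa)]
                double_update(ps, pa, pr, ps2)
            s = s2
    return q1, q2

q1, q2 = train()
```

The planning loop is what buys the sample efficiency: each real market interaction is reused `planning_steps` times through the learned model, so fewer costly real transitions are needed for the same number of value updates.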