A Multi-Model Dynamic Selection Framework Using Deep Contextual Bandits for Urban Traffic Flow Prediction in Large-Scale Road Networks
Abstract
To address the challenge of model selection in large-scale traffic flow prediction, this paper proposes a dynamic multi-model selection framework based on Deep Contextual Bandits (DCB). The framework focuses on finding the best combination of sub-models: it uses contextual information about each road segment to select dynamically among candidate predictors, which yields more efficient and accurate traffic flow prediction. Several mechanisms are introduced to improve policy learning and convergence, including a baseline network, experience replay, double-model estimation, and prioritized experience sampling. A clustering-based strategy is further designed to reduce the search space and to improve generalization and transferability. Experiments on real-world traffic datasets demonstrate that the proposed framework significantly outperforms traditional static fusion methods, reinforcement learning (RL) baselines, and mainstream spatiotemporal prediction models. In particular, the framework yields a 1.0% improvement in R² and a 3.2% reduction in MAE compared to state-of-the-art baselines, while reducing inference time by 43.1%. Moreover, the framework adapts its model selection to varying contexts, and ablation studies confirm the effectiveness of its key components.
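To make the selection idea concrete, the following is a minimal sketch of context-dependent model selection with a bandit, not the framework proposed in this paper. It assumes a linear reward estimator per candidate predictor in place of a deep network, epsilon-greedy exploration, uniform experience replay, and a reward equal to the negative absolute error of the chosen predictor. The baseline network, double-model estimation, prioritized sampling, and clustering are omitted. The names `ContextualModelSelector` and `run_demo`, the single "peak hour" feature, and the toy error values are all hypothetical.

```python
import random


class ContextualModelSelector:
    """Epsilon-greedy contextual bandit with one linear reward estimator per arm.

    Each arm stands for one candidate predictor (hypothetical setup).
    """

    def __init__(self, n_arms, n_features, epsilon=0.1, lr=0.05,
                 capacity=1000, seed=0):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.lr = lr
        self.capacity = capacity
        self.rng = random.Random(seed)
        # weights[a] = feature weights for arm a, with the bias stored last.
        self.weights = [[0.0] * (n_features + 1) for _ in range(n_arms)]
        self.replay = []  # stored (context, arm, reward) transitions

    def estimate(self, context, arm):
        """Estimated reward of choosing `arm` in `context`."""
        w = self.weights[arm]
        return sum(wi * xi for wi, xi in zip(w, context)) + w[-1]

    def select(self, context, explore=True):
        """Pick an arm: random with probability epsilon, else the greedy one."""
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        scores = [self.estimate(context, a) for a in range(self.n_arms)]
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, context, arm, reward, batch_size=8):
        """Store the transition, then take SGD steps on a replayed mini-batch."""
        self.replay.append((context, arm, reward))
        if len(self.replay) > self.capacity:
            self.replay.pop(0)  # drop the oldest transition
        batch = self.rng.sample(self.replay, min(batch_size, len(self.replay)))
        for ctx, a, r in batch:
            err = r - self.estimate(ctx, a)
            w = self.weights[a]
            for i, xi in enumerate(ctx):
                w[i] += self.lr * err * xi
            w[-1] += self.lr * err


def run_demo(steps=2000, seed=0):
    """Train on a toy stream with two predictors and one binary context feature."""
    rng = random.Random(seed)
    selector = ContextualModelSelector(n_arms=2, n_features=1, seed=seed)
    for _ in range(steps):
        peak = 1.0 if rng.random() < 0.5 else 0.0
        # Made-up absolute errors: arm 0 is better off-peak, arm 1 at peak.
        errors = [1.0, 0.2] if peak else [0.2, 1.0]
        arm = selector.select([peak])
        selector.update([peak], arm, reward=-errors[arm])
    return selector


if __name__ == "__main__":
    trained = run_demo()
    print("peak     ->", trained.select([1.0], explore=False))
    print("off-peak ->", trained.select([0.0], explore=False))
```

In this toy setting the greedy choice is expected to settle on arm 1 for the peak context and arm 0 otherwise, which illustrates why a context-aware selector can beat any single fixed predictor. Only the chosen predictor has to be evaluated at inference time, so selection can also be cheaper than running every candidate and fusing the outputs.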