In this paper we propose distributed ICIC accelerated Q-learning (DIAQ), an algorithm for dynamic spectrum access (DSA) in LTE cellular systems. It combines distributed reinforcement learning (RL) and standardized inter-cell interference coordination (ICIC) signalling in the LTE downlink, using the framework of heuristically accelerated RL (HARL). Furthermore, we present a novel Bayesian-network-based approach to the theoretical analysis of RL-based DSA, which explains the improvement in convergence behaviour that DIAQ is predicted to achieve over classical RL. The scheme is also assessed using large-scale simulations of a stadium temporary event network.
Compared to a typical heuristic ICIC approach, DIAQ provides significantly better quality of service and supports considerably higher network throughput densities. In addition, DIAQ dramatically improves the initial performance, speeds up the convergence and improves the steady-state performance of a state-of-the-art distributed Q-learning algorithm, confirming the theoretical predictions. Finally, our scheme is designed to comply with the current LTE standards. Therefore, it enables easy implementation of robust distributed machine intelligence for full self-organisation in existing commercial networks.
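
To illustrate the general HARL idea referred to above, the following is a minimal sketch and not the DIAQ algorithm itself. It assumes a stateless Q-table with one entry per subchannel, a hypothetical set `flagged` of subchannels that neighbouring cells have reported through ICIC signalling, and a hypothetical weight `xi` on the heuristic term. The names `select_subchannel` and `update_q`, and the parameters `epsilon` and `alpha`, are illustrative assumptions. As in the generic HARL framework, the heuristic biases only the action choice, while the Q-value update is left unchanged.

```python
import random


def select_subchannel(q_values, flagged, xi=1.0, epsilon=0.0, rng=random):
    """Choose a subchannel index from Q-values biased by a heuristic.

    q_values: list of learned Q-values, one per subchannel (assumed stateless).
    flagged:  set of subchannel indices reported by neighbours (hypothetical
              representation of ICIC information).
    xi:       weight of the heuristic term (assumption).
    epsilon:  probability of a uniformly random exploratory choice.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Heuristic H(a): penalise flagged subchannels, leave the others untouched.
    scores = [
        q + xi * (-1.0 if i in flagged else 0.0)
        for i, q in enumerate(q_values)
    ]
    # Greedy choice over Q + xi * H; ties resolve to the lowest index.
    return scores.index(max(scores))


def update_q(q_values, action, reward, alpha=0.1):
    """Standard stateless Q-learning update; the heuristic plays no role here."""
    q_values[action] += alpha * (reward - q_values[action])
```

In this sketch, setting `xi=0` recovers plain greedy Q-learning, so the heuristic can only change which subchannels are tried, not the values that are eventually learned.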