Cooperative Q-learning techniques for distributed online power allocation in femtocell networks

By

Saad H.

Mohamed A.

ElBatt T.

Software and Communications

In this paper, we address the problem of distributed interference management for femtocells that share the same frequency band with macrocells, using distributed multi-agent Q-learning. We formulate and solve two problems representing two different Q-learning algorithms, namely, femto-based distributed and sub-carrier-based distributed power control using Q-learning (FBDPC-Q and SBDPC-Q). FBDPC-Q is a multi-agent algorithm that works on a global basis, that is, it deals with the aggregate macrocell and femtocell capacities. Its complexity increases exponentially with the number of sub-carriers in the system. Also, it does not take the per-sub-carrier macrocell capacity into consideration as a constraint. To overcome these problems, we propose SBDPC-Q, a multi-agent algorithm that works on a sub-carrier basis, that is, it deals with the per-sub-carrier macrocell and femtocell capacities. Both FBDPC-Q and SBDPC-Q operate under three different learning paradigms: independent (IL), cooperative (CL), and weighted cooperative (WCL). IL is considered the simplest form of applying Q-learning in multi-agent scenarios, where all the femtocells learn independently. CL and WCL are the proposed schemes, in which femtocells share partial information during the learning process in order to strike a balance between practical relevance and performance. We prove the convergence of the CL paradigm when used in the FBDPC-Q algorithm. We show via simulations that the CL paradigm outperforms the IL paradigm in terms of the aggregate femtocell capacity, especially in networks with a large number of femtocells and a large number of power levels. In addition, we propose WCL to address the CL limitations. Finally, we evaluate the robustness and scalability of both FBDPC-Q and SBDPC-Q against several typical dynamics of plausible wireless scenarios (fading, path loss, random activity of femtocells, etc.). We show that, compared with the IL and WCL paradigms, the CL paradigm scales best to a large number of femtocells and is the most robust to the network dynamics. Copyright © 2014 John Wiley & Sons, Ltd.
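
To make the difference between independent and cooperative learning concrete, the following is a minimal sketch, not the authors' algorithm. It assumes each femtocell keeps a tabular Q-function whose actions index discrete transmit power levels, and it uses the standard Q-learning update. The abstract does not specify the state definition, the reward, or what "partial information" is shared, so the `shared_row` step (summing the Q-rows that neighbors report for the current state) and all function and parameter names are illustrative assumptions.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update, applied in place.

    q is a list of rows, one row per state, one entry per power level.
    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def greedy_action(q_row):
    """Index of the largest Q-value in a row (ties go to the lowest index)."""
    return max(range(len(q_row)), key=lambda a: q_row[a])


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return greedy_action(q_row)


def shared_row(own_row, neighbor_rows):
    """Hypothetical cooperation step (an assumption, not taken from the paper):
    add the Q-rows that neighboring femtocells share for the current state
    to this femtocell's own row before an action is selected."""
    total = list(own_row)
    for row in neighbor_rows:
        for a, value in enumerate(row):
            total[a] += value
    return total


# Independent learner: selects a power level from its own Q-row only.
rng = random.Random(0)
own = [1.0, 0.0]
il_choice = epsilon_greedy(own, epsilon=0.0, rng=rng)

# Cooperative learner: selects from the combined row instead.
cl_choice = greedy_action(shared_row(own, [[0.0, 2.0], [0.0, 0.5]]))
```

In this toy setting the independent learner prefers the first power level, while the cooperative learner switches to the second once its neighbors' values are taken into account. A weighted variant could scale each neighbor's row before summing, which is again only an assumption about how such a scheme might look.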