Optimal cooperative cognitive relaying and spectrum access for an energy harvesting cognitive radio: Reinforcement learning approach
In this paper, we consider a cognitive setting under the context of cooperative communications, where the cognitive radio (CR) user is assumed to be a self-organized relay for the network. The CR user and the primary user (PU) are assumed to be energy harvesters. The CR user cooperatively relays some of the undelivered packets of the PU. Specifically, the CR user stores a fraction of the undelivered primary packets in a relaying queue (buffer). It manages the flow of the undelivered primary packets to its relaying queue using the appropriate actions over time slots. Moreover, it has the decision of choosing the used queue for channel accessing at idle time slots (slots where the PU's queue is empty). It is assumed that one data packet transmission dissipates one energy packet. The optimal policy changes according to the primary and CR users arrival rates to the data and energy queues as well as the channels connectivity. The CR user saves energy for the PU by taking the responsibility of relaying the undelivered primary packets. It optimally organizes its own energy packets to maximize its payoff as time progresses. © 2015 IEEE.