Deep reinforcement learning based channel selection strategies for cognitive radio VANET | Nishu Gupta | Motilal Nehru National Institute of Technology, India |

Wireless, Telecommunication & IoT

September 13-14, 2021

Connect and Communicate the Trends of Technology

Nishu Gupta

Motilal Nehru National Institute of Technology, India

Title: Deep reinforcement learning based channel selection strategies for cognitive radio VANET

Abstract

Statement of the Problem: Channel selection is a challenging task in cognitive radio vehicular networks. Vehicles must sense the channels periodically, which wastes considerable time in this delay-intolerant network; that time could instead be used for data transmission. Employing road side units (RSUs) for sensing can help: the RSUs select a channel and allocate it to vehicles on demand. This sensing should be proactive, however, so that an RSU knows in advance which channel to allocate when a request arrives. To this end, this paper proposes a deep learning algorithm, DLOCS, that trains the network on previously sensed data. The proposed protocol is simulated and the results are compared with existing methods. Compared with similar recent works, the packet delivery ratio increases by 2%, throughput increases by 1.8%, average delay decreases by 2%, and the primary user collision ratio is reduced by 3.2%.

Methodology & Theoretical Orientation: The CR-VANET scenario is shown in Figure 1. It consists of RSUs, placed along the roadside at set distances, and vehicles. Each vehicle is equipped with an on-board unit (OBU) containing a transceiver, through which it communicates with RSUs and neighboring vehicles using the IEEE 802.11p and IEEE 1609.4 (WAVE) standards. The OBU also contains a GPS receiver for synchronization with Coordinated Universal Time (UTC). The RSUs periodically sense the channels to learn which cognitive radio (CR) channels are available, and each RSU computes the optimal CR channel after processing the sensed data. The primary users (PUs) become active in a controlled random way: their activity is related to that of previous days, because road traffic is not completely random from one day to the next. After processing this data and training the network, the RSU selects the optimal channel and provides it to vehicles upon request.
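The proactive allocation described above can be illustrated with a minimal Python sketch. It is not the DLOCS implementation: the class `RoadSideUnit`, its methods, and the use of a plain "lowest estimated occupancy" rule in place of the trained network are all assumptions made for illustration. The point it shows is only that the RSU refreshes its choice whenever sensing data arrives, so a vehicle request is answered without any sensing delay.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RoadSideUnit:
    """Hypothetical RSU model that keeps a per-channel estimate of PU occupancy."""

    # channel id -> estimated fraction of time a primary user occupies the channel
    occupancy_estimate: Dict[int, float] = field(default_factory=dict)
    # channel chosen ahead of time, before any vehicle asks for one
    preselected: Optional[int] = None

    def update_from_sensing(self, channel: int, busy_fraction: float) -> None:
        """Store the latest sensed busy fraction and refresh the proactive choice."""
        self.occupancy_estimate[channel] = busy_fraction
        # Stand-in rule (assumption): pick the channel with the lowest estimate.
        self.preselected = min(self.occupancy_estimate,
                               key=self.occupancy_estimate.get)

    def handle_request(self, vehicle_id: str) -> Optional[int]:
        """Answer a vehicle's request immediately with the preselected channel."""
        return self.preselected


rsu = RoadSideUnit()
for channel, busy in [(1, 0.6), (2, 0.1), (3, 0.4)]:
    rsu.update_from_sensing(channel, busy)
allocated = rsu.handle_request("veh-42")  # channel 2 has the lowest estimate
```

In this sketch the request handler does no computation at all, which mirrors the requirement that the RSU should already know which channel to hand out when a vehicle asks.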

Findings: DLOCS is simulated using Network Simulator 2 (ns-2.34) with patches for VANET and cognitive radio simulation. OpenStreetMap and SUMO are used to generate the simulation scenario. Sensing data for the different channels is generated in a controlled random way. A matrix is generated for each minute, and its columns hold the data for each second of that minute. Each element is 1 with probability p. The value of p is held constant for 60 minutes, changes at the 61st minute, and then remains constant for the next 60 minutes. It is high during peak hours (minutes 481-600 and 961-1140). From this data, the RSU calculates the parameters used for channel selection.
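The sensing-data generation described above can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the authors' generator: the abstract does not give the actual values of p, so the ranges in `draw_p` are invented for illustration, and treating matrix rows as channels is also an assumption (the text only says that columns correspond to seconds). The function names are hypothetical.

```python
import random
from typing import List

MINUTES_PER_DAY = 1440
SECONDS_PER_MINUTE = 60
# Peak minutes taken from the text (1-based, inclusive). Both ranges start and
# end on 60-minute block boundaries, so p never changes inside a peak block.
PEAK_RANGES = [(481, 600), (961, 1140)]


def is_peak(minute: int) -> bool:
    """True if the 1-based minute of the day falls in a peak range."""
    return any(lo <= minute <= hi for lo, hi in PEAK_RANGES)


def draw_p(minute: int, rng: random.Random) -> float:
    """Assumed ranges: high p during peak hours, low p otherwise."""
    return rng.uniform(0.6, 0.9) if is_peak(minute) else rng.uniform(0.1, 0.4)


def generate_day(num_channels: int, seed: int = 0) -> List[List[List[int]]]:
    """Return day[minute_index][channel][second] with 0/1 entries."""
    rng = random.Random(seed)
    day = []
    p = [0.0] * num_channels
    for minute in range(1, MINUTES_PER_DAY + 1):
        # Redraw p at minutes 1, 61, 121, ... so it stays fixed for 60 minutes.
        if (minute - 1) % 60 == 0:
            p = [draw_p(minute, rng) for _ in range(num_channels)]
        # One matrix per minute: rows are channels (assumption), columns are seconds.
        matrix = [
            [1 if rng.random() < p[ch] else 0 for _ in range(SECONDS_PER_MINUTE)]
            for ch in range(num_channels)
        ]
        day.append(matrix)
    return day
```

Calling `generate_day(3, seed=1)` yields 1440 matrices of 3 rows by 60 columns; a fixed seed makes the run reproducible, which is convenient when comparing methods on the same data.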

Conclusion & Significance: Avoiding collisions between PU and secondary user (SU) transmissions is an important challenge in CR-VANET. It requires intelligent processing of previous channel occupancy data. In this work, a deep learning approach is used to train the network to select the CR channel with the minimum probability of PU occupancy. First, the sensing data is processed to calculate the mean channel occupancy per minute, the consistency of channel occupancy within each minute, and the consistency of channel occupancy across adjacent minutes. These values are given as the initial input to train the network. The network then learns about the different channels according to whether a PU arrives during transmission: if a PU arrives, the reward is negative; if not, the reward is positive. DLOCS shows an improvement over existing methods. Compared with the most recent work, the packet delivery ratio (PDR) increases by 2%, PU collisions are reduced by 3.2%, and delay is reduced by 2%.
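The feature extraction and reward scheme described above can be illustrated with a small Python sketch. Several parts are assumptions rather than details from the abstract: the two "consistency" measures are not defined in the text, so the definitions in `minute_features` are illustrative guesses; the reward magnitudes of +1 and -1 are assumed (the text only gives the signs); and the hypothetical `ChannelValueTable` is a simple tabular running-average learner standing in for the deep network.

```python
from typing import Dict, List, Tuple


def minute_features(curr: List[int], prev: List[int]) -> Tuple[float, float, float]:
    """Features for one channel from two consecutive minutes of 0/1 samples.

    Expects at least two samples in `curr` and one in `prev`.
    """
    mean_occ = sum(curr) / len(curr)
    # Assumed definition: share of consecutive seconds whose state did not change.
    unchanged = sum(1 for a, b in zip(curr, curr[1:]) if a == b)
    within_minute = unchanged / (len(curr) - 1)
    # Assumed definition: closeness of this minute's mean to the previous minute's.
    across_minutes = 1.0 - abs(mean_occ - sum(prev) / len(prev))
    return mean_occ, within_minute, across_minutes


def reward(pu_arrived: bool) -> float:
    """Negative reward if a PU arrived during transmission, positive otherwise."""
    return -1.0 if pu_arrived else 1.0


class ChannelValueTable:
    """Tabular stand-in for the trained network: one value estimate per channel."""

    def __init__(self, initial: Dict[int, float], step_size: float = 0.1) -> None:
        self.values = dict(initial)
        self.step_size = step_size

    def select(self) -> int:
        """Pick the channel with the highest current value estimate."""
        return max(self.values, key=self.values.get)

    def update(self, channel: int, pu_arrived: bool) -> None:
        """Move the channel's value a small step toward the observed reward."""
        v = self.values[channel]
        self.values[channel] = v + self.step_size * (reward(pu_arrived) - v)
```

A table could be seeded from the features, for example with one minus the mean occupancy of each channel, and then updated after every transmission. Repeated PU arrivals on a channel pull its value down until another channel is preferred.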