Multi-Agent Multi-Armed Bandit Learning for Grant-Free Access in Ultra-Dense IoT Networks

DOI: 10.1109/tccn.2024.3366908
Publication Date: 2024-02-19T20:13:57Z
ABSTRACT
Meeting the diverse quality-of-service (QoS) requirements in ultra-dense Internet of Things (IoT) networks operating under varying network loads is challenging. Moreover, latency-critical IoT applications cannot afford the excessive control signaling overhead caused by centralized access methods. A distributed approach can potentially address this problem. In this regard, multi-agent multi-armed bandit (MAB) learning is a promising tool for designing distributed access protocols. This paper proposes an MAB learning-based grant-free access mechanism for ultra-dense IoT networks, where multiple base stations (BSs) serve massive numbers of delay-sensitive and delay-tolerant devices. Delay-sensitive devices are prioritized to choose BSs with larger numbers of channels in a probabilistic manner. The proposed mechanism enables devices to improve their BS selection over time so as to accommodate the maximum number of devices that meet a prescribed latency-reliability criterion. Simulation results show that the proposed mechanism outperforms a random BS selection strategy in which end devices do not employ any learning scheme to adapt to network dynamics.
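The idea of devices learning their BS selection independently can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the paper's algorithm: the epsilon-greedy rule, the collision-based binary reward, the channel-count-weighted exploration for delay-sensitive devices, and all names and default values (`simulate`, `num_sensitive`, `channels_per_bs`, and so on) are hypothetical choices made for illustration. The latency-reliability criterion is not modeled.

```python
import random


def simulate(num_sensitive=4, num_tolerant=6, channels_per_bs=(2, 4, 8),
             rounds=2000, epsilon=0.1, seed=0, learn=True):
    """Toy multi-agent bandit for grant-free BS selection (illustrative only).

    Returns the fraction of collision-free transmissions, averaged over
    all devices during the final ~10% of rounds.
    """
    rng = random.Random(seed)
    num_bs = len(channels_per_bs)
    num_devices = num_sensitive + num_tolerant
    bs_ids = list(range(num_bs))

    # Each device is an independent agent with one arm per BS.
    counts = [[0] * num_bs for _ in range(num_devices)]
    values = [[0.0] * num_bs for _ in range(num_devices)]

    tail_len = max(1, rounds // 10)
    tail_start = rounds - tail_len
    tail_successes = 0.0

    for t in range(rounds):
        picks = []
        for d in range(num_devices):
            if not learn or rng.random() < epsilon:
                if d < num_sensitive:
                    # Assumption: delay-sensitive devices explore BSs with
                    # probability proportional to their channel count.
                    bs = rng.choices(bs_ids, weights=channels_per_bs, k=1)[0]
                else:
                    # Delay-tolerant devices explore uniformly at random.
                    bs = rng.randrange(num_bs)
            else:
                # Exploit: pick the BS with the best estimated success rate,
                # breaking ties at random.
                best = max(values[d])
                bs = rng.choice([b for b in bs_ids if values[d][b] == best])
            # Grant-free access: pick a channel at the BS without a grant.
            channel = rng.randrange(channels_per_bs[bs])
            picks.append((bs, channel))

        # Count how many devices landed on each (BS, channel) pair.
        load = {}
        for pick in picks:
            load[pick] = load.get(pick, 0) + 1

        # A transmission succeeds only if no other device chose the same
        # channel at the same BS; each device updates its own estimate.
        for d, (bs, channel) in enumerate(picks):
            reward = 1.0 if load[(bs, channel)] == 1 else 0.0
            counts[d][bs] += 1
            values[d][bs] += (reward - values[d][bs]) / counts[d][bs]
            if t >= tail_start:
                tail_successes += reward

    return tail_successes / (num_devices * tail_len)
```

Calling `simulate(learn=True)` and `simulate(learn=False)` with the same seed gives a rough comparison between the learning agents and a non-learning baseline in this toy model. Note that with `learn=False` the delay-sensitive devices still use the channel-weighted draw, so the baseline is not a purely uniform random BS selection. The outcome depends heavily on the assumed load and parameters and says nothing about the results reported in the paper.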