
Friday Jun 13, 2025
Asynchronous Multi-Agent Reinforcement Learning for Scheduling in Subnetworks
We address radio resource scheduling in a network of multiple in-X subnetworks providing wireless Ultra-Reliable Low-Latency Communication (URLLC) service. Each subnetwork is controlled by an agent responsible for scheduling resources to its devices. Agents rely solely on interference measurements for information about other agents, with no explicit coordination. Subnetwork mobility and fast-fading effects create a non-stationary environment, adding to the complexity of the scheduling problem. We model this scenario as a multi-agent Markov Decision Process (MDP) and propose a Multi-Agent Deep Reinforcement Learning (MADRL) approach under URLLC constraints that integrates Long Short-Term Memory (LSTM) with the Deep Deterministic Policy Gradient (DDPG) algorithm to handle non-stationarity and high-dimensional action spaces. We apply an asynchronous update strategy in which only one agent updates its policy at a time; this reduces learning variability, resolves policy conflicts, and improves the interpretability of the MADRL approach. Simulation results demonstrate that the asynchronous update mechanism outperforms synchronous updates and baseline methods, achieving superior reliability, resource utilization, and explainability.
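To make the asynchronous update idea concrete, below is a minimal PyTorch sketch of the pattern the abstract describes: each agent's deterministic (DDPG-style) policy is an LSTM over its local observation history, and in each training round only one agent takes a gradient step while the others act with frozen policies, so the learner faces a quasi-stationary environment. All names, dimensions, and the dummy loss here are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

OBS_DIM, HIDDEN_DIM, ACT_DIM, N_AGENTS = 8, 32, 4, 3

class LstmActor(nn.Module):
    # Deterministic policy: an LSTM over the observation history,
    # followed by a linear head squashed to [0, 1] (e.g., resource shares).
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(OBS_DIM, HIDDEN_DIM, batch_first=True)
        self.head = nn.Linear(HIDDEN_DIM, ACT_DIM)

    def forward(self, obs_seq):
        out, _ = self.lstm(obs_seq)        # obs_seq: (batch, time, OBS_DIM)
        return torch.sigmoid(self.head(out[:, -1]))

actors = [LstmActor() for _ in range(N_AGENTS)]
optims = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in actors]

def async_round(round_idx, histories, actor_loss_fn):
    # Asynchronous schedule: only agent k = round_idx % N_AGENTS updates
    # this round; the other agents' actions are detached, so their
    # policies stay fixed and receive no gradients.
    k = round_idx % N_AGENTS
    actions = [a(histories[i]) if i == k else a(histories[i]).detach()
               for i, a in enumerate(actors)]
    loss = actor_loss_fn(actions, k)       # stand-in for the DDPG actor loss
    optims[k].zero_grad()
    loss.backward()
    optims[k].step()

# Toy usage with a dummy loss (a real setup would use the critic's value):
histories = [torch.randn(2, 5, OBS_DIM) for _ in range(N_AGENTS)]
async_round(0, histories, lambda acts, k: -acts[k].mean())

In a full DDPG setup the actor loss would come from a critic evaluating the joint action; the round-robin schedule above is one simple way to guarantee exactly one learner per round.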
Ashvin Srinivasan, Aalto University; Junshan Zhang, UC Davis; Olav Tirkkonen, Aalto University