Friday Jun 13, 2025

Digital twin (DT)-driven deep reinforcement learning (DRL) has emerged as a promising paradigm for wireless network optimization, offering a safe and efficient training environment for policy exploration. However, existing methods provide little theoretical guarantee on the real-world performance of DT-trained policies before actual deployment. In this paper, we propose the DT bisimulation metric (DT-BSM), a novel metric based on the Wasserstein distance, to quantify the discrepancy between the Markov decision processes (MDPs) of the DT and of the corresponding real-world wireless network environment. We prove that, for any DT-trained policy, its sub-optimality (regret) in real-world deployment is bounded by a weighted sum of the DT-BSM and its sub-optimality within the MDP of the DT. To avoid the prohibitive computational complexity of the Wasserstein distance in large-scale wireless network scenarios, we further introduce a modified DT-BSM based on the total variation distance. Numerical experiments validate this first theoretical result on provable and calculable performance bounds for DT-driven DRL.
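As a rough illustration of the two distances the abstract names, the sketch below compares the total variation distance and the Wasserstein-1 distance between two small discrete distributions. It is not the DT-BSM itself: the distributions, the 1-D support, and the function names are assumptions made for this example, and the closed-form Wasserstein expression used here only holds for distributions on a sorted 1-D support. In general the Wasserstein distance requires solving an optimal transport problem, whereas the total variation distance is a single sum over the support.

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

def wasserstein_1d(p, q, positions):
    """Wasserstein-1 distance on a sorted 1-D support (special case only)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    positions = np.asarray(positions, dtype=float)
    # In 1-D, W1 is the area between the two cumulative distribution functions.
    cdf_gap = np.abs(np.cumsum(p - q))[:-1]
    return float((cdf_gap * np.diff(positions)).sum())

# Hypothetical next-state distributions for one state-action pair,
# one from the real environment and one from the digital twin.
positions = np.array([0.0, 1.0, 2.0, 3.0])
p_real = np.array([0.1, 0.4, 0.4, 0.1])
p_twin = np.array([0.2, 0.4, 0.3, 0.1])

tv = total_variation(p_real, p_twin)
w1 = wasserstein_1d(p_real, p_twin, positions)
diameter = positions[-1] - positions[0]

print(f"TV = {tv:.3f}, W1 = {w1:.3f}, diameter * TV = {diameter * tv:.3f}")
```

On a bounded support, the Wasserstein-1 distance is at most the diameter of the support times the total variation distance, so in this toy setting the cheaper quantity also gives an upper bound on the more expensive one.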

Provable Performance Bounds for Digital Twin-driven Deep Reinforcement Learning in Wireless Networks: A Novel Digital Twin Evaluation Metric

Zhenyu Tao, Wei Xu, Xiaohu You, Southeast University
