
Friday Jun 13, 2025
Coded Distributed Computing over Multi-server Clustered Network for Federated Learning
This paper applies coded distributed computing (CDC) to a multi-server clustered network (MSCN) designed to accelerate the gradient update process in federated learning (FL) while accounting for both communication and computational heterogeneity. As the number of participating devices grows and resource heterogeneity becomes more pronounced, reducing total execution latency (TEL) becomes a critical challenge. To address it, we optimize the matching between heterogeneous devices and server task loads, which improves resource utilization and enhances the robustness and fault tolerance of the FL system. To minimize TEL, we propose two device assignment algorithms, both built on the task allocation for the single server (TASS) algorithm: a greedy algorithm (GADA) and an iter-genetic algorithm (IGADA). Simulation results and theoretical analysis confirm that the proposed algorithms substantially reduce TEL across various scenarios compared with existing CDC methods, with complexity markedly lower than that of the exhaustive scheme.
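To make the device-to-server matching problem concrete, here is a minimal Python sketch of a generic greedy assignment heuristic. It is not the paper's GADA or TASS algorithm, whose details the abstract does not give. The latency model is an assumption for illustration only: each server's latency is its task load divided by the total compute rate of its assigned devices, with no coding redundancy, stragglers, or communication delay. The names `greedy_assign`, `device_rates`, and `server_loads` are hypothetical.

```python
import math


def greedy_assign(device_rates, server_loads):
    """Toy greedy matching of devices to servers (illustrative, not GADA).

    Assumed model: latency of server s = server_loads[s] / (sum of the
    compute rates of devices assigned to s); TEL is the maximum over servers.
    """
    totals = [0.0] * len(server_loads)  # aggregate compute rate per server
    assignment = {}                     # device index -> server index

    def latency(s):
        # A server with no devices can never finish its load.
        return math.inf if totals[s] == 0 else server_loads[s] / totals[s]

    # Consider the fastest devices first; give each to the slowest server.
    order = sorted(range(len(device_rates)), key=lambda i: -device_rates[i])
    for d in order:
        s = max(range(len(server_loads)), key=latency)
        assignment[d] = s
        totals[s] += device_rates[d]

    tel = max(latency(s) for s in range(len(server_loads)))
    return assignment, tel


if __name__ == "__main__":
    assignment, tel = greedy_assign([4.0, 3.0, 2.0, 1.0], [12.0, 6.0])
    print(assignment, tel)
```

A heuristic of this kind runs in roughly O(D log D + D·S) time for D devices and S servers, whereas exhaustive search over all assignments grows as S^D. That gap is the kind of complexity saving the abstract refers to when it compares against the exhaustive scheme.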
Wenjing Mou, Harbin Institute of Technology (Shenzhen); Shushi Gu, Harbin Institute of Technology (Shenzhen); Zhikai Zhang, Pengcheng Laboratory; Guixiang Lei, Harbin Institute of Technology (Shenzhen); Zhang Qinyu, Harbin Institute of Tech.; Wei Xiang, La Trobe University