Federated Graph Learning for Distributed Resource Management in Multi-Cluster Computing Environments

Authors

  • Zeyu Wang, Department of Computer Science and Engineering, University of California, Santa Cruz, USA

DOI:

https://doi.org/10.71465/

Keywords:

Federated Learning, Graph Neural Networks, Resource Management, Multi-Cluster Computing, Distributed Scheduling, Cloud Computing

Abstract

The proliferation of distributed computing infrastructures calls for resource management strategies that operate across heterogeneous multi-cluster environments while preserving data privacy and system autonomy. This paper proposes a federated graph learning framework that uses Graph Neural Networks (GNNs) for resource allocation and scheduling in distributed computing systems. The approach addresses resource fragmentation, heterogeneous workload characteristics, and inter-cluster communication overhead through a decentralized learning paradigm. The framework constructs dynamic resource graphs representing computational nodes, network topologies, and workload dependencies, enabling collaborative learning across clusters without centralizing sensitive operational data. We introduce a hierarchical architecture inspired by established distributed systems designs, combining local graph-based resource allocation with federated model aggregation through a master-slave coordination mechanism. The graph representation captures both global network topology and fine-grained local resource states, enabling multi-scale optimization of allocation decisions. We implement distributed model parallelism to scale across thousands of nodes while maintaining sub-second decision latencies. Experimental evaluation shows that the approach outperforms traditional centralized scheduling methods, reducing average job completion time by 28% and improving overall cluster utilization by 34% across diverse workload scenarios.
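
As a rough illustration of the two ingredients named in the abstract, the sketch below pairs one mean-aggregation message-passing step over a small resource graph with a sample-weighted average of per-cluster model parameters. It is not the paper's implementation. The function names (`message_passing_step`, `federated_average`), the feature layout, and the choice of plain weighted averaging as the aggregation rule are all assumptions made for illustration; the abstract does not specify the GNN layer or the aggregation rule actually used.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def message_passing_step(
    features: Dict[str, Vector],
    edges: List[Tuple[str, str]],
    weight: float,
) -> Dict[str, Vector]:
    """One mean-aggregation step over an undirected resource graph.

    Each node's new feature is the mean of its own and its neighbours'
    features, scaled by a single scalar `weight` (a stand-in for a
    learned weight matrix; hypothetical simplification).
    """
    neighbours: Dict[str, List[str]] = {node: [] for node in features}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated: Dict[str, Vector] = {}
    for node, feat in features.items():
        group = [feat] + [features[n] for n in neighbours[node]]
        mean = [sum(column) / len(group) for column in zip(*group)]
        updated[node] = [weight * value for value in mean]
    return updated


def federated_average(updates: List[Tuple[Vector, int]]) -> Vector:
    """Average per-cluster parameters, weighted by local sample counts.

    Only parameters and counts leave each cluster; the raw operational
    data stays local. The weighting scheme is an assumption.
    """
    if not updates:
        raise ValueError("at least one cluster update is required")
    total = sum(count for _, count in updates)
    dim = len(updates[0][0])
    return [
        sum(params[i] * count for params, count in updates) / total
        for i in range(dim)
    ]


# Hypothetical two-node cluster: features are [cpu_load, memory_load].
local_graph = {"node_a": [1.0, 0.0], "node_b": [3.0, 2.0]}
smoothed = message_passing_step(local_graph, [("node_a", "node_b")], weight=1.0)

# Hypothetical coordinator step: two clusters report parameters and sample counts.
global_params = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

In this sketch the coordinator only ever sees parameter vectors and sample counts, which is the property the abstract describes as learning across clusters without centralizing operational data.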

Published

2025-12-01