Temporal Causal Discovery in Evolving Microservice Topologies under Distribution Shift
DOI:
https://doi.org/10.71465/fapm533
Keywords:
Temporal causal discovery, microservice architecture, distribution shift, autocorrelation, invariant learning, graph neural networks
Abstract
Microservice architectures exhibit highly dynamic behavior: causal relationships between services evolve continuously as system configurations change and workload distributions shift. Existing causal discovery methods struggle to handle autocorrelated observational data and distribution shifts at the same time, which leads to unstable detection rates and spurious causal links. This paper proposes TDICD (Temporal Distribution-Invariant Causal Discovery), a framework that combines temporal dependency modeling, invariant pattern recognition across multiple system environments, and graph neural networks to discover stable causal structures in evolving microservice topologies. Our method addresses three key challenges: handling strong autocorrelation in time-series metrics, detecting causal links that remain invariant under environmental perturbations, and adapting to non-stationary distributions as system configurations evolve. Experimental evaluation on synthetic benchmarks and two real-world microservice applications demonstrates that TDICD achieves detection rates exceeding 85% for both weakly and strongly autocorrelated causal links while maintaining false positive rates below 8%. Compared to baseline methods including PCMCI, DYNOTEARS, and standard GNN approaches, TDICD shows a 23% improvement in F1-score for contemporaneous link detection and 31% better invariance to distribution shifts.
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.