Kubernetes Multi-Cloud: Federation, Submariner and Unified Management
76% of enterprise organizations use more than one cloud provider (Flexera State of the Cloud 2026). Some do it to avoid vendor lock-in, others to comply with data sovereignty requirements (GDPR: European data stays in Europe), still others for geographic disaster recovery. The result is the same challenge: how do you manage multiple Kubernetes clusters distributed across different clouds as if they were a single platform?
In this article we will explore strategies and tools for multi-cloud Kubernetes federation: from Submariner for inter-cluster networking (letting Pods on AWS talk to Pods on GCP), through Admiralty/Liqo for cross-cluster scheduling, to Rancher Fleet and ArgoCD ApplicationSet for declarative management of dozens of clusters from a single control point.
What You Will Learn
- Multi-cluster models: active-active, active-passive, federated
- Submariner: Cross-cluster overlay network between different clouds
- ServiceExport/ServiceImport with MCS API for multi-cluster service discovery
- ArgoCD ApplicationSet with cluster generators for deployment on N clusters
- KubeAdmiral/Liqo: Intelligent cross-cluster workload scheduling
- Rancher Fleet: Centralized management of 1000+ clusters
- Multi-cluster Istio: Unified service mesh across different clusters
- Security and compliance considerations in multi-cloud K8s
Multi-Cluster Architecture Models
There is no single approach to multi-clustering. The choice depends on the objectives:
| Model | Description | Use Case |
|---|---|---|
| Hub and Spoke | A hub cluster manages spoke clusters via ArgoCD/Flux | Centralized GitOps, policy management |
| Active-Active | Multiple active clusters, traffic distributed with global load balancer | High availability, low latency for global users |
| Active-Passive (DR) | Primary cluster + disaster recovery cluster | Business continuity, RTO/RPO defined |
| Edge + Central | Central cluster + geographic edge clusters | IoT, CDN logic, ultra-low latency |
| Data sovereignty | Cluster by region (EU, US, APAC) for GDPR | Regulatory data compliance |
Submariner: Cross-Cluster Networking
The fundamental problem of multi-clustering is that each cluster has its own IP ranges for Pods and Services. Pods on cluster-A cannot reach Pods on cluster-B because their networks are isolated. Submariner solves this by creating a secure overlay network between clusters via IPsec or WireGuard tunnels.
Submariner installation
# Install subctl (Submariner's CLI)
curl -Ls https://get.submariner.io | VERSION=0.17.0 bash
# Prerequisite: the clusters must be able to reach each other (public IPs or VPN)
# Set the kubeconfig for both clusters
export KUBECONFIG=~/.kube/cluster-aws:~/.kube/cluster-gcp
# Prepare the broker cluster (one of the clusters becomes the coordination "broker")
subctl deploy-broker --kubeconfig ~/.kube/cluster-aws
# Join the AWS cluster (the one acting as broker) to the broker
subctl join broker-info.subm --kubeconfig ~/.kube/cluster-aws \
  --clusterid cluster-aws \
  --cable-driver wireguard  # WireGuard for performance, IPsec for compatibility
# Join the GCP cluster to the broker
subctl join broker-info.subm --kubeconfig ~/.kube/cluster-gcp \
  --clusterid cluster-gcp \
  --cable-driver wireguard
# Verify the inter-cluster connection
subctl show connections  # shows the status of the tunnels between clusters
subctl verify --kubeconfig ~/.kube/cluster-aws --toconfig ~/.kube/cluster-gcp --verbose
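One prerequisite worth checking before `subctl join`: Submariner requires non-overlapping Pod and Service CIDRs across clusters (otherwise you need its Globalnet feature). A minimal sketch of an overlap check — the CIDR values below are illustrative placeholders, not your clusters' real ranges:

```shell
# Illustrative CIDRs -- replace with the actual Pod CIDRs of your clusters
AWS_POD_CIDR="10.42.0.0/16"
GCP_POD_CIDR="10.52.0.0/16"

# Overlap check using Python's standard ipaddress module
python3 -c "
import ipaddress, sys
a = ipaddress.ip_network('$AWS_POD_CIDR')
b = ipaddress.ip_network('$GCP_POD_CIDR')
sys.exit(1 if a.overlaps(b) else 0)
" && echo "CIDRs do not overlap" \
  || echo "OVERLAP: enable Submariner Globalnet (subctl deploy-broker --globalnet)"
```

If the ranges do overlap, Globalnet assigns each cluster a virtual "global" CIDR and NATs cross-cluster traffic through it.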
ServiceExport and ServiceImport: Multi-Cluster Service Discovery
After Submariner connects the networks, you use the Multi-Cluster Services (MCS) API to export Services from one cluster and import them into another:
# In the GCP cluster: export the database Service
# ServiceExport makes the Service visible to the other clusters
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: postgres-primary
  namespace: data-layer
---
# In the AWS cluster: the Service automatically appears as a ServiceImport
# (Submariner syncs it through the broker)
# The Service is reachable as:
# postgres-primary.data-layer.svc.clusterset.local
# Cross-cluster connectivity test
kubectl run test-pod -n app-layer --rm -it \
  --image=postgres:16 \
  -- psql -h postgres-primary.data-layer.svc.clusterset.local -U app -d mydb
# Submariner handles traffic routing automatically:
# Pod in cluster-aws -> WireGuard tunnel -> Pod in cluster-gcp
# with ~5-10ms of added latency (tunnel overhead)
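For reference, this is roughly what the ServiceImport created on the importing cluster looks like — a sketch only, since field details vary by Submariner and MCS API version, and the port assumes Postgres on 5432:

```yaml
# Created automatically on cluster-aws by Submariner; do not apply it by hand
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: postgres-primary
  namespace: data-layer
spec:
  type: ClusterSetIP        # a single clusterset VIP fronting the exported Service
  ports:
    - port: 5432
      protocol: TCP
status:
  clusters:
    - cluster: cluster-gcp  # the cluster that exported the Service
```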
ArgoCD ApplicationSet for Multi-Cluster Deployments
With ArgoCD ApplicationSet and the Cluster Generator, you can deploy the same application on all clusters registered in ArgoCD with a single manifest:
# Register the clusters in ArgoCD
argocd cluster add cluster-aws --name production-aws
argocd cluster add cluster-gcp --name production-gcp
argocd cluster add cluster-azure --name production-azure
# Check the registered clusters
argocd cluster list
---
# applicationset-multi-cluster.yaml
# Deploys the same app to every cluster with label environment=production
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: api-service-global
  namespace: argocd
spec:
  goTemplate: true  # enables Go templating ({{.name}}) and sprig functions like "default"
  generators:
    # Cluster Generator: produces one Application per registered ArgoCD cluster
    - clusters:
        selector:
          matchLabels:
            environment: production  # filter clusters with this label
  template:
    metadata:
      name: 'api-service-{{.name}}'  # {{.name}} = cluster name
      labels:
        cluster: '{{.name}}'
    spec:
      project: production
      source:
        repoURL: https://github.com/company/k8s-manifests
        targetRevision: main
        path: apps/api-service/production
        kustomize:
          # Per-cluster customization
          patches:
            - target:
                kind: Deployment
                name: api-service
              patch: |-
                - op: replace
                  path: /spec/replicas
                  value: {{.metadata.annotations.replicas | default "3"}}
      destination:
        server: '{{.server}}'  # API server URL of the cluster
        namespace: api-service
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
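The Cluster Generator matches labels on the Secrets that ArgoCD stores for each registered cluster, so for the selector above to fire, the cluster Secret needs the `environment: production` label. A sketch of such a Secret (the name, server address, and credentials are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster-production-aws
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster  # marks this Secret as an ArgoCD cluster
    environment: production                  # matched by the ApplicationSet selector
stringData:
  name: production-aws
  server: https://1.2.3.4:6443
  config: |
    {"bearerToken": "<token>", "tlsClientConfig": {"caData": "<base64-ca>"}}
```

You can also attach the label after registration with `kubectl label secret` in the `argocd` namespace instead of editing the Secret by hand.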
Rancher Fleet: Management of 1000+ Clusters
Rancher Fleet (part of the Rancher project) is designed to handle thousands of clusters with a single GitOps control plane. It is particularly useful for edge computing (IoT clusters) and for organizations that run a cluster per customer:
# Fleet uses the concept of a Bundle (a set of manifests to deploy)
# and a GitRepo that defines the Git source
# gitrepo.yaml - registers the Git repo with Fleet
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: company-apps
  namespace: fleet-default
spec:
  repo: https://github.com/company/k8s-manifests
  branch: main
  paths:
    - apps/
  targets:
    # Deploy to all clusters with label env=production
    - name: production
      clusterSelector:
        matchLabels:
          env: production
    - name: development
      clusterSelector:
        matchLabels:
          env: development
---
# fleet.yaml (inside apps/) - per-target Helm overrides
# live in targetCustomizations, not in the GitRepo itself
defaultNamespace: apps
helm:
  values:
    replicaCount: 3
    resources:
      requests:
        cpu: 250m
targetCustomizations:
  # Different configuration for dev
  - name: development
    clusterSelector:
      matchLabels:
        env: development
    helm:
      values:
        replicaCount: 1
---
# Check deploy status on all clusters
kubectl get bundles -A
kubectl get bundledeployments -A | grep -v Healthy  # show only problems
Istio Multi-Cluster: Unified Service Mesh
To enable mTLS and traffic management between services on different clusters, Istio supports multi-cluster mode with a shared mesh:
# Istio multi-cluster with Primary-Remote topology
# The primary cluster runs the control plane; the remote cluster uses it
# 1. Install Istio on the primary cluster
istioctl install --set profile=default \
  --set meshConfig.trustDomain=company.com \
  --set values.pilot.env.EXTERNAL_ISTIOD=true
# 2. Create the secret with the remote cluster's credentials
istioctl create-remote-secret \
  --kubeconfig=~/.kube/cluster-gcp \
  --name=cluster-gcp | kubectl apply -f -
# 3. Install Istio on the GCP cluster in remote mode, pointing it at the
#    primary control plane (replace PILOT_EXTERNAL_IP with the istiod address)
istioctl install --kubeconfig ~/.kube/cluster-gcp \
  --set profile=remote \
  --set values.global.remotePilotAddress=PILOT_EXTERNAL_IP
# With Istio multi-cluster configured:
# - automatic mTLS between services on different clusters
# - cross-cluster traffic routing with DestinationRule
# - Kiali shows the complete multi-cluster topology
# VirtualService for intelligent multi-cluster routing:
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: api-service
spec:
  hosts:
    - api-service
  http:
    - route:
        # 70% of traffic to the local cluster (lower latency)
        - destination:
            host: api-service.production.svc.cluster.local
            port:
              number: 8080
          weight: 70
        # 30% of traffic to the GCP cluster (for global load balancing)
        - destination:
            host: api-service.production.svc.clusterset.local  # MCS
            port:
              number: 8080
          weight: 30
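Instead of fixed weights, you can let Istio prefer the local cluster and shift traffic only when local endpoints become unhealthy, via locality load balancing. A sketch (the region labels are assumptions for an AWS/GCP pair):

```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: api-service-failover
spec:
  host: api-service.production.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
          - from: eu-west-1     # assumed AWS region label
            to: europe-west1    # assumed GCP region label
    outlierDetection:           # required for locality failover to trigger
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 60s
```

Note that locality failover only activates when `outlierDetection` is set, since Istio needs health signals to decide when to eject local endpoints.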
Global Load Balancing with Cloudflare and AWS Global Accelerator
# DNS-based global load balancing with automatic failover
# Cloudflare Load Balancing with health checks for each cluster
# Terraform configuration for the global LB:
resource "cloudflare_load_balancer" "global_api" {
  zone_id          = var.cloudflare_zone_id
  name             = "api.company.com"
  fallback_pool_id = cloudflare_load_balancer_pool.aws.id
  default_pool_ids = [
    cloudflare_load_balancer_pool.aws.id,
    cloudflare_load_balancer_pool.gcp.id
  ]
  steering_policy = "geo"  # geographic routing
  proxied         = true

  region_pools {
    region   = "EEU"  # Eastern Europe
    pool_ids = [cloudflare_load_balancer_pool.gcp_eu.id]
  }
  region_pools {
    region   = "WNAM"  # Western North America
    pool_ids = [cloudflare_load_balancer_pool.aws_us.id]
  }
}

resource "cloudflare_load_balancer_pool" "aws" {
  name = "k8s-aws-production"
  origins {
    name    = "aws-ingress"
    address = var.aws_ingress_ip
    enabled = true
    weight  = 1
  }
  monitor = cloudflare_load_balancer_monitor.http.id
}

resource "cloudflare_load_balancer_monitor" "http" {
  type           = "http"
  path           = "/health"
  expected_codes = "200"
  interval       = 60
  retries        = 2
  timeout        = 5
}
Multi-Cluster Backup and Disaster Recovery
# Velero for cross-cluster backup
# Install Velero on all clusters with the same S3 backend
helm install velero vmware-tanzu/velero \
  --namespace velero \
  --create-namespace \
  --set configuration.backupStorageLocation[0].name=aws-s3 \
  --set configuration.backupStorageLocation[0].provider=aws \
  --set configuration.backupStorageLocation[0].bucket=company-k8s-backups \
  --set configuration.backupStorageLocation[0].config.region=eu-west-1
---
# Daily backup schedule
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-backup
  namespace: velero
spec:
  schedule: "0 2 * * *"  # every night at 2:00
  template:
    includedNamespaces:
      - production
      - data-layer
    snapshotVolumes: true
    ttl: 720h  # keep for 30 days
---
# Restore on the GCP cluster (DR)
velero restore create --from-backup daily-backup-20260901 \
  --include-namespaces production \
  --kubeconfig ~/.kube/cluster-gcp
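On the DR cluster it is good practice to mount the shared bucket read-only, so a restore on GCP can never mutate or delete the primary cluster's backups. A sketch of the BackupStorageLocation, reusing the bucket and region from the Helm install above:

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: aws-s3
  namespace: velero
spec:
  provider: aws
  accessMode: ReadOnly  # DR cluster can restore but never write or delete backups
  objectStorage:
    bucket: company-k8s-backups
  config:
    region: eu-west-1
```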
Security Considerations in Multi-Cloud Kubernetes
Multi-Cloud Security Checklist
- Isolated cluster credentials: Each cluster has its own service account. Do not share credentials between clusters
- Encrypted inter-cluster communication: Submariner with WireGuard or IPsec guarantees confidentiality of data in transit between clusters
- Unified policies with OPA: Use a centralized OPA/Gatekeeper registry to ensure consistent security policies across all clusters
- Centralized audit log: Aggregate API audit logs from all clusters into a central SIEM
- GDPR Governance: Regularly test that EU data remains in EU clusters using Network Policy and node affinity
- Automatic certificate rotation: With cert-manager on each cluster, automate TLS certificate rotation
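For the GDPR point above, a minimal sketch of pinning a data workload to EU nodes with node affinity — the `topology.kubernetes.io/region` label is standard, while the Deployment name, image, and region values are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eu-data-processor  # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: eu-data-processor
  template:
    metadata:
      labels:
        app: eu-data-processor
    spec:
      affinity:
        nodeAffinity:
          # Hard requirement: Pods only schedule onto nodes in EU regions
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/region
                    operator: In
                    values:
                      - eu-west-1     # AWS EU
                      - europe-west1  # GCP EU
      containers:
        - name: processor
          image: company/eu-data-processor:1.0  # illustrative image
```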
Conclusions: The Kubernetes at Scale Path
This is the last article in the Kubernetes at Scale series. In 12 articles we have covered the full range of enterprise Kubernetes operations: from advanced networking with Cilium eBPF, to storage with StatefulSet and CSI drivers, to Operators for stateful workloads, to autoscaling with Karpenter, to service mesh with Istio, to security with RBAC and OPA, up to multi-cloud with Submariner and Federation.
Multi-cloud Kubernetes is not a destination but a journey: you start with two clusters (maybe production + DR), add tools like Submariner and ArgoCD ApplicationSet, and scale gradually. The key is to build on a solid foundation — secure networking, GitOps for state consistency, unified observability — before adding multi-cluster complexity.
The entire Kubernetes at Scale Series
- 01 - Kubernetes Networking: CNI, Cilium with eBPF and Network Policy
- 02 - Persistent Storage: CSI, PV, StorageClass and StatefulSet
- 03 - Kubernetes Operators: CRD, Controller Pattern and Operator SDK
- 04 - Autoscaling: HPA, VPA, KEDA and Karpenter
- 05 - Service Mesh: Istio vs Linkerd, mTLS and Traffic Management
- 06 - Security: RBAC, Pod Security Standards and OPA Gatekeeper
- 07 - Multi-Tenancy: Namespace, Resource Quota and HNC
- 08 - AI and GPU Workloads: Device Plugins and Training Jobs
- 09 - FinOps: Rightsizing, Spot Instances and Cost Reduction
- 10 - GitOps with ArgoCD: Declarative Deploy and Progressive Rollout
- 11 - Observability: Prometheus, Grafana and OpenTelemetry
- 12 - Multi-Cloud: Federation, Submariner and Unified Management (this article)
Related Series
- Terraform and Infrastructure as Code — multi-cloud provisioning of Kubernetes clusters
- Edge Computing and Serverless — edge cluster as an extension of multi-cloud
- DevSecOps — policy-as-code and uniform security across multi-cloud clusters