Transforming an open-source reference platform into a production-ready, secure, and observable multi-cloud architecture
This project demonstrates the evolution of upbound/platform-ref-multi-k8s from a basic Crossplane reference into a complete enterprise platform, featuring:
- Multi-cloud infrastructure governed by a single control plane
- Security-first design with defense-in-depth (Cilium, Istio, Gatekeeper, Trivy)
- GitOps workflows with ArgoCD
- Multi-stage CI/CD pipelines (Tekton + optimized Dockerfiles)
- Event-driven autoscaling (KEDA)
- Full observability (Prometheus, Grafana, Loki, Tempo)
- Real workload: scalable-chatapp running across multiple clouds
Most tutorials show toy examples in a single cloud. This project tackles real enterprise challenges:
- How do you manage MongoDB and Redis across AWS/GCP/Azure?
- How do you ensure security compliance before code reaches production?
- How do you scale WebSocket workloads dynamically?
- How do you maintain observability across distributed systems?
- Crossplane: Control plane for multi-cloud resource provisioning
- Managed Services: DocumentDB and ElastiCache (AWS), Memorystore (GCP), Cosmos DB (Azure)
- GitOps: ArgoCD for declarative deployments
- RBAC + Namespaces: Environment isolation (dev/staging/prod)
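
As a rough illustration of the GitOps piece, an Argo CD Application along the following lines would keep a cluster in sync with a path in this repository. The repoURL, the chatapp namespace, and the choice of cluster/app as the synced path are assumptions made for the sketch, not values taken from this project.

```yaml
# Sketch only: repoURL, names, and namespaces below are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: chatapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/your-org/platform-ref-multi-k8s.git  # placeholder
    targetRevision: main
    path: cluster/app
  destination:
    server: https://kubernetes.default.svc  # the cluster Argo CD runs in
    namespace: chatapp                      # assumed namespace
  syncPolicy:
    automated:
      prune: true     # remove resources that were deleted from Git
      selfHeal: true  # revert manual drift back to the Git state
    syncOptions:
      - CreateNamespace=true
```

Automated sync with prune and selfHeal is one reasonable default for dev and staging; a production environment might prefer manual sync or sync windows.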
- Cilium + NetworkPolicies: Default-deny networking with label selectors
- Istio mTLS (STRICT): Encrypted service mesh traffic
- Gatekeeper (OPA): Policy enforcement (no :latest tags, runAsNonRoot, required labels)
- Trivy: Container image and IaC scanning in pipelines
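
The security controls listed above could be expressed roughly as follows. This is a minimal sketch: the chatapp namespace and the team label are hypothetical, and the Gatekeeper constraint assumes the K8sRequiredLabels ConstraintTemplate from the community gatekeeper-library is already installed.

```yaml
# Default-deny: selects every pod in the namespace and allows no traffic.
# Explicit allow policies (including DNS egress) must be added on top.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: chatapp  # assumed namespace
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Mesh-wide STRICT mTLS: a PeerAuthentication named "default" in the
# Istio root namespace applies to all workloads in the mesh.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# Required-labels policy; assumes the gatekeeper-library template
# K8sRequiredLabels exists in the cluster. The "team" key is illustrative.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: deployments-must-have-team
spec:
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
  parameters:
    labels:
      - key: team
```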
- Tekton Pipelines:
- Stage 1: Build optimized image (multi-stage Dockerfile)
- Stage 2: Security scan (Trivy)
- Stage 3: Load test (k6)
- Stage 4: Sign & push to registry
- Stage 5: Deploy via ArgoCD
- Dockerfiles: Build in full image, run in distroless
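
One way the five stages could be wired together is a Tekton Pipeline that orders tasks with runAfter. The task names (build-image, trivy-scan, k6-load-test, sign-and-push, argocd-sync) are hypothetical placeholders for Tasks that would have to exist in the cluster; workspaces and credentials are omitted for brevity.

```yaml
# Sketch only: every taskRef name below is a hypothetical Task.
apiVersion: tekton.dev/v1
kind: Pipeline
metadata:
  name: chatapp-ci
spec:
  params:
    - name: image
      type: string
      description: Fully qualified image reference to build and ship
  tasks:
    - name: build              # Stage 1: multi-stage Docker build
      taskRef:
        name: build-image
      params:
        - name: image
          value: $(params.image)
    - name: scan               # Stage 2: Trivy scan of the built image
      runAfter: ["build"]
      taskRef:
        name: trivy-scan
      params:
        - name: image
          value: $(params.image)
    - name: load-test          # Stage 3: k6 load test
      runAfter: ["scan"]
      taskRef:
        name: k6-load-test
    - name: sign-and-push      # Stage 4: sign and push to the registry
      runAfter: ["load-test"]
      taskRef:
        name: sign-and-push
      params:
        - name: image
          value: $(params.image)
    - name: deploy             # Stage 5: trigger an Argo CD sync
      runAfter: ["sign-and-push"]
      taskRef:
        name: argocd-sync
```

Because each task declares runAfter, a failed scan or load test stops the run before anything is signed or deployed.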
- KEDA: Event-driven autoscaling (Redis pub/sub lag, CPU metrics)
- Karpenter (demo): Dynamic node provisioning
- Descheduler: Pod optimization on saturated nodes
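
A KEDA ScaledObject combining a Redis trigger with a CPU trigger might look like the sketch below. KEDA's Redis scalers work from list length or stream consumer-group backlog rather than classic pub/sub channels, so this example assumes the chat backend consumes from a Redis Stream. The address, stream, consumer group, Deployment name, and thresholds are all illustrative.

```yaml
# Sketch only: names, address, and thresholds are assumptions.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: chat-backend
  namespace: chatapp
spec:
  scaleTargetRef:
    name: chat-backend           # Deployment to scale (assumed name)
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: redis-streams
      metadata:
        address: redis.chatapp.svc.cluster.local:6379
        stream: chat-messages
        consumerGroup: chat-backend
        pendingEntriesCount: "100"  # target pending entries per replica
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"                 # target average CPU utilization (%)
```

With multiple triggers, the scaling decision follows whichever trigger asks for the most replicas. The CPU trigger requires CPU requests to be set on the target pods.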
- Prometheus + Grafana: Federated metrics across clusters
- Loki + Tempo: Distributed logs and traces with end-to-end correlation
- Dashboards: SLOs, security posture, autoscaling behavior, CI/CD metrics
- Base App: Distributed chat (Node.js, Socket.IO, MongoDB, Redis)
- Multi-cloud Deployment: Replicas across clusters, load-balanced by Istio
- Crossplane-Managed: DBs and queues provisioned per environment
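
To make the per-environment provisioning concrete, a claim against a database composition could look roughly like this. The API group, kind, and parameter names are hypothetical, since they depend on the XRDs defined under apis/database/; only compositionSelector and writeConnectionSecretToRef are standard Crossplane claim fields.

```yaml
# Sketch only: group, kind, and parameters are hypothetical and must
# match whatever XRD the platform actually defines.
apiVersion: database.example.org/v1alpha1
kind: MongoDBInstance
metadata:
  name: chatapp-db
  namespace: staging
spec:
  parameters:
    size: small                # illustrative parameter
  compositionSelector:
    matchLabels:
      provider: aws            # picks the composition labeled for AWS
  writeConnectionSecretToRef:
    name: chatapp-db-conn      # Secret the app reads its connection string from
```

Switching the provider label is what would move the same claim from one cloud's managed service to another's, assuming each composition carries a matching label.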
platform-ref-multi-k8s/
├── apis/
│ ├── cluster/ # Crossplane XRDs for K8s clusters
│ ├── database/ # NEW: MongoDB compositions (DocumentDB, CosmosDB)
│ └── cache/ # NEW: Redis compositions (ElastiCache, Memorystore)
├── cluster/
│ ├── gitops/ # NEW: ArgoCD installation
│ ├── security/ # NEW: Cilium, Istio, Gatekeeper configs
│ ├── observability/ # NEW: Prometheus, Grafana, Loki stack
│ └── app/ # NEW: Chat app manifests
├── pipelines/
│ ├── tekton/ # NEW: CI/CD pipeline definitions
│ └── dockerfiles/ # NEW: Multi-stage Dockerfiles
├── examples/
│ ├── aws-cluster.yaml
│ ├── gcp-cluster.yaml
│ └── chatapp-claim.yaml # NEW: App resource claims
├── tests/
│ └── k6/ # NEW: Load testing scenarios
└── docs/
├── architecture/ # NEW: Diagrams and design decisions
├── security/ # NEW: Compliance and threat model
└── runbooks/ # NEW: Troubleshooting guides
- Fork baseline + project documentation
- Crossplane compositions for managed databases (DocumentDB, CosmosDB)
- Crossplane compositions for managed cache (ElastiCache, Memorystore)
- ArgoCD installation + GitOps structure
- Fork scalable-chatapp
- Add healthchecks and graceful shutdown
- Multi-stage Dockerfile (Node.js → distroless)
- Basic Kubernetes manifests (Deployment, Service, Ingress)
- Tekton installation
- Build pipeline (multi-stage Docker build)
- Security scanning (Trivy for images + IaC)
- k6 load testing stage
- ArgoCD integration (automated sync on pipeline success)
- RBAC policies per namespace
- Cilium installation + default-deny NetworkPolicies
- Istio service mesh with mTLS STRICT
- Gatekeeper policy pack (OPA)
- SealedSecrets or Vault integration
- Prometheus + Grafana stack
- Loki for log aggregation
- Tempo for distributed tracing
- Custom dashboards (app SLOs, security metrics, autoscaling)
- KEDA installation
- Redis scaler for chat backend
- CPU-based scaler for stateless components
- Karpenter demo (AWS-specific node autoscaling)
- Descheduler for pod optimization
- EKS cluster composition + deployment
- GKE cluster composition + deployment
- Cross-cluster Istio mesh (multi-primary)
- Load testing across regions
- Disaster recovery with Velero
- Cost analysis dashboard (Kubecost/OpenCost)
- Chaos engineering experiments (Litmus Chaos)
- Complete documentation + architecture diagrams
- Kubernetes cluster (kind/minikube for local testing)
- kubectl, helm, crossplane CLI
- AWS/GCP/Azure credentials configured
# 1. Install Crossplane
kubectl apply -f cluster/crossplane/
# 2. Configure cloud providers
kubectl apply -f examples/provider-config-aws.yaml
# 3. Deploy ArgoCD
kubectl apply -k cluster/gitops/argocd/
# 4. Create a cluster claim
kubectl apply -f examples/aws-cluster.yaml
# 5. Deploy the chat app
kubectl apply -f examples/chatapp-claim.yaml

- Uptime SLO: 99.9% availability target
- Latency P95: < 200ms for message delivery
- Autoscaling: KEDA triggers at 80% Redis pub/sub lag
- Security: 100% Gatekeeper policy compliance
- Cost: Per-namespace resource utilization tracking
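
The latency target above could be checked with a Prometheus alerting rule such as the following. It assumes the chat service exports a histogram named chat_message_delivery_seconds, which is a hypothetical metric name; the rule is written in the plain Prometheus rule-file format.

```yaml
# Sketch only: the metric name is an assumption about app instrumentation.
groups:
  - name: chatapp-slo
    rules:
      - alert: ChatDeliveryLatencyP95High
        # P95 over the last 5 minutes, aggregated across all instances.
        expr: |
          histogram_quantile(
            0.95,
            sum by (le) (rate(chat_message_delivery_seconds_bucket[5m]))
          ) > 0.2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: P95 message delivery latency has been above 200ms for 10 minutes
```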
k6 run tests/k6/websocket-spike.js \
--vus 500 \
--duration 5m

# Scan Dockerfiles
trivy config pipelines/dockerfiles/
# Scan running images
trivy image chatapp:latest

# Pod deletion experiment
kubectl apply -f tests/chaos/pod-delete.yaml

See CONTRIBUTING.md for:
- Commit message conventions
- Branch naming strategy
- PR review process
Each major architectural decision is documented in docs/architecture/decisions/ using ADRs (Architecture Decision Records).
Key documents:
- ADR-001: Why Crossplane over Terraform
- ADR-002: Flux vs ArgoCD selection
- ADR-003: Multi-stage pipeline design
Current Phase: Foundation (Phase 1)
Last Updated: December 2025
Production Ready: Target Q1 2026
This project extends upbound/platform-ref-multi-k8s under Apache 2.0.
Built with: Crossplane • ArgoCD • Tekton • Istio • KEDA • Prometheus • k6