Kickoff is a microservices-based tournament management platform designed and deployed as a production-style system on AWS with Kubernetes, Git-based continuous delivery, and end-to-end automation.
This CS302 project builds on the earlier CS203-Kickoff submission, reusing the same functional domain (community-led football tournaments in Singapore) while focusing here on DevOps, cloud architecture, and Kubernetes. For detailed application features, business rules, and functional behavior, refer to the CS203 repository’s README and documentation.
- High-Level Architecture
- Core Platform Components
- Infrastructure as Code
- Continuous Integration
- Continuous Delivery & GitOps
- End-to-End Deployment Workflow
- Kubernetes Deployment & Environment Strategy
- DevOps Practices and Operational Excellence
- Self-Directed Research and Tooling Choices
- Operations, Observability, and Reliability
- Team
The platform follows a cloud-native microservices architecture with separate concerns for compute, data, messaging, and networking.
The AWS deployment uses managed services to provide a realistic production environment:
- Amazon EKS (Elastic Kubernetes Service)
  - Runs all microservices as Kubernetes deployments and services.
  - Integrates with AWS IAM for cluster access.
- Amazon RDS for MySQL
  - Managed relational database for core transactional data (users, clubs, tournaments, payments, etc.).
- Amazon S3
  - Object storage for file uploads (for example, verification documents and images).
- Elastic Load Balancing (Network Load Balancer, NLB)
  - Exposes the frontend and public APIs from the EKS cluster.
- VPC, Subnets, and Security Groups
  - Private subnets for EKS worker nodes and RDS.
  - Public subnets for load balancers only.
- Secrets Management
  - Application secrets (database credentials, Stripe keys, SMTP, JWT) are injected into Kubernetes from a centralized store.
The Kubernetes layer models each business capability as a separate microservice and standardizes deployment patterns:
- Microservices (apps in this repo)
  - kickoff-users – authentication and user management.
  - kickoff-clubs – club lifecycle management.
  - kickoff-tournaments – tournament creation, scheduling, and participation.
  - kickoff-scheduler – cron-based background jobs for reminders.
  - kickoff-notifications – email notifications using RabbitMQ events.
  - kickoff-payments – payment processing (Stripe) and webhook ingestion.
  - kickoff-verifications – verification workflows.
  - kickoff-frontend – React/Vite SPA served via NGINX.
- Infrastructure services (under kickoff-k8s-infra and kickoff-k8s-apps)
  - Namespaced, Kustomize-based manifests for each microservice.
  - rabbitmq as a message broker inside the cluster.
  - Istio/ingress configuration, Argo CD / GitOps resources, and observability stack (Grafana, Jaeger, Kiali).
At a high level, the platform consists of:
- Application Services – Java/Spring Boot microservices plus a Python notifications worker and a React frontend.
- Data & Messaging – MySQL/RDS for persistence and RabbitMQ for event-driven workflows.
- Identity & Security – JWT-based auth, HTTPS ingress, and secret management for credentials and third-party APIs (Stripe, SMTP, etc.).
- CI/CD & GitOps – Automated build, test, image publishing, and environment reconciliation from Git.
- Infrastructure as Code – AWS resources and Kubernetes cluster configuration described declaratively.
Each folder in the repo corresponds to a specific responsibility:
- kickoff-terraform/ – Terraform modules and root config to provision AWS infrastructure.
- kickoff-k8s-infra/ – Cluster-level components (ingress, cert-manager, observability, Argo CD/Kargo, etc.).
- kickoff-k8s-apps/ – Application manifests (deployments, services, Kustomize overlays per microservice).
- kickoff-frontend/, kickoff-users/, kickoff-clubs/, etc. – Application codebases with Dockerfiles.
- kickoff-e2e-tests/ – Playwright end-to-end test suite executed as part of CI.
All cloud resources and most cluster configuration are defined declaratively:
- Terraform (kickoff-terraform/)
  - Provisions VPC, subnets, Internet/NAT gateways, security groups.
  - Creates the EKS cluster and node groups.
  - Manages RDS, S3 buckets, and other AWS resources required by the platform.
  - Exposes key outputs (cluster endpoint, kubeconfig data, DB endpoints, etc.).
- Kubernetes Manifests (kickoff-k8s-infra/, kickoff-k8s-apps/)
  - Uses Kustomize for composable overlays.
  - Defines deployments, services, config maps, secrets (via external secret operators), ingress, and observability stack.
This combination enables repeatable, versioned, and reviewable changes to both AWS infrastructure and in-cluster configuration.
The project is wired with a full GitLab CI/CD pipeline that validates changes from commit to container image:
- Static analysis & formatting
  - Java services: Checkstyle, unit tests via Maven.
  - Python service: pytest and linting using the dependencies in requirements-dev.txt.
  - Frontend: TypeScript checks, ESLint, and Vite build.
- Docker image builds
  - Each service has its own Dockerfile optimized for production images.
  - CI builds images in parallel where possible and tags them with commit SHA and environment tags.
- Security and quality gates
  - Build must succeed for backend, frontend, and infra checks before deployment is allowed.
- End-to-end tests (kickoff-e2e-tests)
  - Playwright tests (kickoff-e2e-tests/tests/**) hit the running environment and verify core user journeys.
Application and infrastructure deployment is implemented using a GitOps model:
- Source of truth in Git
  - Kubernetes manifests for apps live in kickoff-k8s-apps/.
  - Cluster/infrastructure manifests (ingress, cert-manager, monitoring, Argo CD/Kargo, etc.) live in kickoff-k8s-infra/.
- Environment-specific overlays
  - Kustomize overlays define environment-specific differences (image tags, resource limits, endpoints).
- Argo CD / Kargo integration
  - Watches the Git repositories for changes in manifests.
  - Syncs updates into the EKS cluster, providing automated rollouts and rollbacks.
- Promotion flow
  - CI pushes images to the container registry (for example, Amazon ECR).
  - A change to the image tag or configuration in kickoff-k8s-apps triggers Argo CD/Kargo to roll out the new version.
This approach decouples build (CI) from deploy (GitOps) while keeping both fully automated and auditable.
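As a rough illustration of the promotion step, the sketch below rewrites the newTag of one entry in a Kustomize images list, which is the kind of manifest change the GitOps controller would then pick up. The function name set_new_tag, the sample kustomization text, and the tag abc1234 are hypothetical and not taken from this repository; in practice a command such as kustomize edit set image would typically make this edit.

```python
# Hypothetical sample of an overlay's kustomization.yaml (not from this repo).
SAMPLE_KUSTOMIZATION = """\
images:
  - name: kickoff-users
    newTag: old-tag
  - name: kickoff-clubs
    newTag: keep-me
"""


def set_new_tag(kustomization: str, image_name: str, new_tag: str) -> str:
    """Return the kustomization text with newTag replaced for one image.

    Simple line-based sketch: it assumes each image entry starts with
    '- name:' and carries its 'newTag:' on a later line of the same entry.
    """
    out = []
    in_target = False
    for line in kustomization.splitlines():
        stripped = line.strip()
        if stripped.startswith("- name:"):
            # A new list entry begins; check whether it is the image we want.
            in_target = stripped.split(":", 1)[1].strip() == image_name
        elif in_target and stripped.startswith("newTag:"):
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}newTag: {new_tag}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Tag the users service with a (hypothetical) short commit SHA.
    print(set_new_tag(SAMPLE_KUSTOMIZATION, "kickoff-users", "abc1234"))
```

Committing the resulting file to kickoff-k8s-apps would be the only deployment action; the rollout itself is left to the controller.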
Bringing everything together, a typical end-to-end flow looks like this:
- Developer pushes code to main or opens a merge request.
- GitLab CI/CD pipeline runs:
  - Unit tests, linting, and builds for all services.
  - Docker images are built and pushed (tagged by commit SHA).
  - Playwright end-to-end tests (kickoff-e2e-tests) validate core flows against a test environment.
- Manifests are updated:
  - Image tags or configuration changes are committed to kickoff-k8s-apps/ (manually or via automation).
- GitOps controller reconciles state:
  - Argo CD/Kargo detects manifest changes and deploys to the EKS cluster.
  - Rollouts are monitored; if health checks fail, the system can roll back.
- Traffic is routed via ingress and load balancers to the updated services.
The platform includes a set of operational tooling to support troubleshooting and performance insights:
- Metrics and Dashboards
  - Grafana dashboards for key service metrics.
- Distributed Tracing
  - Jaeger integration to trace cross-service requests.
- Service Mesh / Traffic Management
  - Istio/Kiali for traffic routing, mTLS (where enabled), and visualization of service-to-service calls.
- Access and Diagnostics
  - kubectl port-forward to internal dashboards such as Argo CD, Grafana, Kiali, Jaeger, and Kargo.
  - Logs available via kubectl logs and, in a production setup, a centralized logging solution.
Examples of accessing cluster tooling (assuming kubectl is configured):
- Argo CD:
kubectl port-forward svc/argocd-server -n argocd 8080:443
- Grafana:
kubectl port-forward -n istio-system svc/grafana-staging 3000:3000
- Kiali:
kubectl port-forward -n istio-system svc/kiali 20001:20001
- Jaeger:
kubectl port-forward -n istio-system svc/jaeger 16686:16686
- Kargo:
kubectl port-forward -n kargo svc/kargo-api 8081:80
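The commands above can also be collected in a small helper. This is a convenience sketch only: the DASHBOARDS table restates the namespaces, service names, and ports listed above, while the names DASHBOARDS, port_forward_cmd, and port_forward are hypothetical and not part of this repository.

```python
import subprocess

# name -> (namespace, service, local port, remote port), as listed above.
DASHBOARDS = {
    "argocd": ("argocd", "svc/argocd-server", 8080, 443),
    "grafana": ("istio-system", "svc/grafana-staging", 3000, 3000),
    "kiali": ("istio-system", "svc/kiali", 20001, 20001),
    "jaeger": ("istio-system", "svc/jaeger", 16686, 16686),
    "kargo": ("kargo", "svc/kargo-api", 8081, 80),
}


def port_forward_cmd(name: str) -> list[str]:
    """Build the kubectl port-forward argument list for a dashboard."""
    namespace, service, local, remote = DASHBOARDS[name]
    return ["kubectl", "port-forward", "-n", namespace, service, f"{local}:{remote}"]


def port_forward(name: str) -> None:
    """Run the port-forward in the foreground (requires a configured kubectl)."""
    subprocess.run(port_forward_cmd(name), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them.
    for dashboard in DASHBOARDS:
        print(" ".join(port_forward_cmd(dashboard)))
```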
Beyond simply running workloads on EKS, the platform is structured to exercise realistic deployment strategies and multi-environment workflows:
- Multi-environment overlays
  - Uses Kustomize overlays to represent different environments (for example, dev, staging, prod).
  - Each environment customizes image tags, replica counts, resource limits, and certain configuration values without duplicating manifests.
- Rolling updates with health checks
  - Services are deployed as Deployments with readiness and liveness probes so that new versions must become healthy before receiving traffic.
  - Rolling update settings ensure pods are gradually replaced, demonstrating zero-downtime deployment patterns under normal operation.
- Separation of concerns
  - Stateless services (microservices and frontend) are deployed as scalable replicas, while stateful components (RDS, RabbitMQ) are provisioned via managed services or separate manifests.
This design shows proficiency in structuring Kubernetes applications for safe rollouts, environment isolation, and operational flexibility.
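To make the rolling-update behaviour concrete, the sketch below computes how many pods a Deployment keeps available, and how many it may run at most, during a rollout when maxSurge and maxUnavailable are given as percentages. The 25% values are the Kubernetes defaults and are assumed here; the overlays in this repository may set different values, and the function name is illustrative.

```python
def rolling_update_bounds(replicas: int,
                          max_surge_pct: int = 25,
                          max_unavailable_pct: int = 25) -> dict[str, int]:
    """Return pod-count bounds during a rolling update.

    Kubernetes rounds a percentage maxSurge up and a percentage
    maxUnavailable down when converting them to absolute pod counts.
    """
    surge = -(-replicas * max_surge_pct // 100)           # ceiling division
    unavailable = replicas * max_unavailable_pct // 100   # floor division
    return {
        "min_available": replicas - unavailable,
        "max_total": replicas + surge,
    }


if __name__ == "__main__":
    for count in (2, 4, 10):
        print(count, rolling_update_bounds(count))
```

With a small replica count such as 2, the unavailable budget rounds down to zero, so under these assumed defaults an old pod is only removed after a new one has passed its readiness probe.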
The project is intentionally built as a DevOps case study, emphasizing automation, observability, and safe change management:
- GitLab-first automation
  - All key workflows (build, test, package, infrastructure validation) are encoded as GitLab CI pipelines instead of manual commands.
  - Pipelines run on every merge request and on main, enforcing a consistent quality bar before any deployment can proceed.
- Shift-left testing
  - Unit tests, integration tests, and Playwright end-to-end tests are integrated into the same CI system to catch issues as early as possible.
- GitOps and auditable operations
  - Day-to-day operational changes (image upgrades, config tweaks, scaling parameters) are made by editing Git manifests, not by ad-hoc kubectl commands.
  - Argo CD/Kargo keep the cluster state in sync with Git, providing a clear history of who changed what and when.
- Rollback strategy
  - If a release misbehaves, rollbacks are performed by reverting Git commits or updating image tags in manifests, triggering Argo CD/Kargo to restore the previous version.
Taken together, these practices demonstrate proficiency with modern DevOps workflows around CI/CD, GitOps, and safe, observable change management on Kubernetes.
Many of the platform’s capabilities are the result of self-directed research into modern cloud-native tooling. Beyond the course requirements, the team evaluated and adopted a combination of open-source projects that work together:
- Argo CD for GitOps
  - Replaces ad-hoc kubectl apply commands with a Git-driven deployment model where Git is the single source of truth.
  - Auto-sync, prune, and selfHeal ensure that removed resources are cleaned up and any manual cluster changes are reverted to the desired state, eliminating configuration drift.
  - The Argo CD dashboard provides a real-time view of application health and sync status across environments, simplifying troubleshooting and day-to-day operations.
- Kargo + Playwright for automated progressive delivery
  - Kargo continuously watches ECR for new images, models them as "Freight", and promotes them through dev → staging → prod by updating Git branches dedicated to each environment.
  - Only Kargo has permission to modify the staging and prod branches in the kickoff-k8s-apps repo, enforcing a single, automated path to production and preventing manual drift.
  - Before promotion, Kargo waits for Argo CD to report healthy, in-sync applications and then runs Playwright end-to-end tests as Kubernetes Jobs, using test-only endpoints (for notifications, scheduler control, and DB reset) to deterministically validate asynchronous workflows.
- Istio + Kiali for service mesh and deep observability
  - Istio sidecars, mTLS, AuthorizationPolicies, and VirtualServices were researched and configured to secure and control service-to-service traffic, which is especially important because staging and prod share a cluster.
  - Kiali’s service graph, error highlighting, and integration with Prometheus, Grafana, Loki, and Jaeger are used to debug choreography-heavy scenarios and trace failing requests end-to-end.
These technology choices were not required by default; they were deliberately introduced, tuned, and integrated by the team to explore real-world DevOps patterns such as GitOps, progressive delivery, service meshes, and production-grade observability.
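The promotion gate described for Kargo (Argo CD healthy and in sync, then Playwright tests passing) can be modelled in a few lines. This is a conceptual sketch only, not Kargo's API or configuration: the names STAGES, StageStatus, next_stage, and can_promote are hypothetical, and only the dev → staging → prod ordering and the three conditions come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Promotion order described in this README.
STAGES = ("dev", "staging", "prod")


@dataclass(frozen=True)
class StageStatus:
    """Signals gathered for the stage that currently holds the release."""
    argo_healthy: bool   # Argo CD reports the applications as healthy
    argo_synced: bool    # Argo CD reports the applications as in sync
    e2e_passed: bool     # Playwright end-to-end tests passed


def next_stage(current: str) -> Optional[str]:
    """Return the stage after `current`, or None if it is the last one."""
    idx = STAGES.index(current)
    return STAGES[idx + 1] if idx + 1 < len(STAGES) else None


def can_promote(current: str, status: StageStatus) -> bool:
    """Allow promotion only when a next stage exists and every gate is green."""
    if next_stage(current) is None:
        return False
    return status.argo_healthy and status.argo_synced and status.e2e_passed


if __name__ == "__main__":
    green = StageStatus(argo_healthy=True, argo_synced=True, e2e_passed=True)
    print(next_stage("dev"), can_promote("dev", green))
```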




