A hands-on learning project demonstrating production-ready, highly-available AWS infrastructure deployment using Terraform. This repository implements the same containerized NestJS application using two different orchestration approaches: AWS ECS (simpler, AWS-native) and AWS EKS (Kubernetes-based, cloud-agnostic).
| Document | Description |
|---|---|
| EKS vs ECS Comparison | Comprehensive comparison of AWS ECS and EKS orchestration approaches, including key differences, component mappings, and decision guidance |
| Terraform Testing | Testing framework documentation covering unit tests, validation tests, CI/CD integration, and best practices for both ECS and EKS implementations |
| Variable Validation Rules | Complete reference of input validation rules, patterns, and examples for catching configuration errors early in both implementations |
- Motivation & Learning Goals
- What You'll Learn
- Application Overview
- Infrastructure Approaches
- Choose Your Path
- Developer Setup: Pre-commit Hooks
- Quick Start
- Project Structure
- Known Limitations
- Contributing
- License
- Questions or Issues?
This project was built as a practical, hands-on learning experience to master Infrastructure as Code (IaC) and cloud-native deployment patterns. The primary goals are:
- Master Terraform: Learn to define, version, and manage cloud infrastructure as code
- Understand AWS Services: Gain practical experience with VPC, ECS, EKS, ALB, Route53, ACM, ECR, and IAM
- Compare Orchestration Approaches: Understand the trade-offs between AWS ECS and Kubernetes (EKS)
- Build Reusable Modules: Create production-ready Terraform modules that follow best practices
- Implement High Availability: Design fault-tolerant architecture across multiple availability zones
- Automate with CI/CD: Use GitHub Actions for consistent, repeatable infrastructure deployment
This repository demonstrates complete, working implementations of both approaches, allowing you to learn by example and experimentation.
By working through this project, you'll gain practical experience with:
Core Infrastructure Concepts:
- Resources deployed across multiple availability zones (AZ)
- Multi-AZ VPC architecture with public/private subnets
- Containers run in private subnets with NAT Gateway
- Container orchestration with ECS and Kubernetes
- Load balancing
- HTTPS with SSL/TLS - Automatic certificate validation via ACM
- DNS management and domain routing
- Least-privilege network access control via security groups
- IAM role-based access control for service-to-service authorization without long-lived credentials
- Auto-scaling for both compute and containers
Terraform Best Practices:
- Module architecture (root modules vs child modules)
- Remote state management with S3
- Environment-specific configuration (dev vs prod)
- Terraform automated testing for all modules
- Auto-generated documentation with terraform-docs
- Staged deployment with dependencies
AWS Services Deep Dive:
- ECS: Clusters, Services, Tasks, Capacity Providers
- EKS: Control plane, Node groups, AWS Load Balancer Controller
- Kubernetes resources via Terraform provider
- ECR with lifecycle policies
- Application Load Balancer with target groups
- Route53 and ACM certificate validation
DevOps Practices:
- GitHub Actions workflows for infrastructure automation (deployment and teardown)
- Pre-commit hooks for code quality, security, and formatting enforcement
- AWS ECR-based Docker container registry with automated lifecycle management
- Dependency-aware staged deployment order
- Infrastructure testing and validation before deployment
Cost Optimization Techniques:
- Environment-specific resource sizing
- Single NAT Gateway option for dev environments
- Spot instances for non-production EKS workloads
- ECR lifecycle policies to perform image cleanup and minimize storage costs
The deployed application is a minimal NestJS API serving two endpoints:
| Endpoint | Description |
|---|---|
| `GET /` | Returns "Hello World!" |
| `GET /health` | Returns true, logs unique instance ID for diagnostics, used by health checks |
The application runs in Docker containers on port 3000, using pnpm as the package manager. A unique instance ID is generated on startup to demonstrate load balancing across multiple containers.
Docker Image: Built from Dockerfile using Node.js Alpine, tagged with ${environment}-${git-sha} format, and stored in AWS ECR with automated lifecycle management.
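The tag format above can be reproduced locally. This is a minimal sketch, assuming a short Git SHA and an `ENVIRONMENT` variable; the helper name `image_tag` and the fallback SHA are hypothetical and not taken from the repository's workflows.

```shell
# Build an image tag in the ${environment}-${git-sha} format.
# image_tag is a hypothetical helper; the repository's workflows may differ.
image_tag() {
  environment="$1"
  git_sha="$2"
  printf '%s-%s\n' "$environment" "$git_sha"
}

# Use the current commit when inside a Git repository, otherwise fall back
# to a placeholder SHA (assumption, for illustration only).
sha="$(git rev-parse --short HEAD 2>/dev/null || echo 'abc1234')"
echo "Image tag: $(image_tag "${ENVIRONMENT:-dev}" "$sha")"
```

Tagging with both the environment and the commit makes it straightforward to trace a running container back to the commit that produced it.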
This repository provides two complete, independent implementations of the same application using different container orchestration strategies.
Location: infra-ecs/
Amazon Elastic Container Service (ECS) provides a simpler, AWS-native container orchestration platform. In this implementation, containers run on EC2 instances managed by an Auto Scaling Group.
Key Characteristics:
- ✅ Lower complexity - Fewer concepts to learn
- ✅ AWS-native - Deep integration with AWS services
- ✅ Lower cost - No control plane charges
- ✅ Faster deployment - Simpler architecture, quicker to provision
- ⚠️ AWS-specific - Not portable to other clouds
Architecture Highlights:
- ECS Cluster with Auto Scaling Group (EC2-based)
- Capacity Provider for automatic scaling
- Task placement strategies (binpack for dev, spread for prod)
- Direct ALB creation via Terraform
Best For:
- Learning AWS container services
- Production workloads staying on AWS
- Cost-sensitive projects
- Teams without Kubernetes expertise
Full ECS Documentation →
Location: infra-eks/
Amazon Elastic Kubernetes Service (EKS) provides a managed Kubernetes control plane with full Kubernetes API compatibility.
Key Characteristics:
- ✅ Cloud-agnostic - Portable across cloud providers
- ✅ Industry standard - Kubernetes skills are transferable
- ✅ Rich ecosystem - Access to Kubernetes tooling and operators
- ✅ Advanced features - StatefulSets, DaemonSets, CRDs, operators
- ⚠️ Higher complexity - Steeper learning curve
- ⚠️ Higher cost - The EKS control plane is billed per cluster, on top of worker node costs
Architecture Highlights:
- Managed EKS control plane across multiple AZs
- Managed node groups with Auto Scaling
- AWS Load Balancer Controller (Helm chart with IRSA)
- Kubernetes resources managed via Terraform provider
- Horizontal Pod Autoscaler (HPA) for application scaling
Best For:
- Learning Kubernetes on AWS
- Multi-cloud or hybrid deployments
- Teams with Kubernetes expertise
- Complex microservices architectures (3+ services)
- Organizations avoiding vendor lock-in
Full EKS Documentation →
Not sure which approach to use? Both implementations deploy the same application but use different orchestration strategies with distinct trade-offs.
Quick Decision Guide:
- Choose ECS if you're committed to AWS, want lower costs, and prefer simpler AWS-native services
- Choose EKS if you need multi-cloud portability, Kubernetes skills, or complex microservices architectures
Complete EKS vs ECS Comparison → - Detailed comparison covering orchestration differences, networking, scaling, IAM, component mappings, and cost analysis.
Ready to get started? Choose the path that matches your goal:
Best for: Quickly getting production-ready AWS infrastructure running without Kubernetes complexity.
Steps:
- Review Prerequisites: Check ECS Prerequisites
- Configure Your Settings: Follow Required Configuration Changes
- Deploy Using CI/CD: Use GitHub Actions Workflows for automated deployment
- Verify Deployment: Access your application at https://yourdomain.com
Estimated Time: 30-45 minutes
Best for: Learning Kubernetes on AWS or preparing for multi-cloud/portable deployments.
Steps:
- Review Prerequisites: Check EKS Prerequisites
- Configure Your Settings: Follow Required Configuration Changes
- Deploy Using CI/CD: Use GitHub Actions Workflows for automated deployment
- Verify Deployment: Configure kubectl and access your application at https://yourdomain.com
Estimated Time: 45-60 minutes
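Configuring kubectl for the verification step typically looks like the following. This is a hedged sketch: `my-eks-cluster` and `us-east-1` are placeholder values, `run` and `configure_kubectl` are hypothetical helpers, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
# Placeholder values (assumptions); take the real ones from your EKS configuration.
CLUSTER_NAME="${CLUSTER_NAME:-my-eks-cluster}"
AWS_REGION="${AWS_REGION:-us-east-1}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

configure_kubectl() {
  # Write or update the kubeconfig entry for the cluster.
  run aws eks update-kubeconfig --region "$AWS_REGION" --name "$CLUSTER_NAME"
  # Confirm that worker nodes are registered and workloads are scheduled.
  run kubectl get nodes
  run kubectl get pods --all-namespaces
}

configure_kubectl
```

Running with `DRY_RUN=0` requires the AWS CLI and kubectl to be installed and AWS credentials to be configured.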
Best for: Learning AWS-native container orchestration without Kubernetes complexity.
Learning Journey:
- Start with Overview: Read ECS High-Level Overview
- Module Architecture: Understand Root vs Child Modules
- Deep Dive into Components:
- Environment Configuration: Study dev vs prod Differences
- Explore Testing: Review Terraform Testing
Key Concepts: ECS Task Definitions, Capacity Providers, awsvpc networking, task placement strategies
Best for: Learning Kubernetes on AWS and cloud-agnostic container orchestration.
Learning Journey:
- Start with Overview: Read EKS High-Level Overview
- Module Architecture: Understand Root vs Child Modules
- Deep Dive into Components:
- Environment Configuration: Study dev vs prod Differences
- Explore Testing: Review Terraform Testing
- Compare Approaches: Read EKS vs ECS Comparison
Key Concepts: Kubernetes Deployments, HPA, IRSA, AWS Load Balancer Controller, Managed Node Groups
This project uses pre-commit hooks to enforce code quality, security, and formatting standards before commits and pushes to the remote repository. The hooks automatically format Terraform files, validate syntax, check for security issues, and prevent secret commits.
- Python 3.7+ (for pre-commit framework)
- Terraform 1.0+
- Homebrew (macOS/Linux)
# 1. Install pre-commit
brew install pre-commit
# 2. Install TFLint
brew install tflint
tflint --init # Install TFLint AWS plugin
# 3. Install Trivy (security scanner)
brew install trivy
# 4. Install detect-secrets
brew install detect-secrets
# 5. Install terraform-docs
brew install terraform-docs
# 6. Install the pre-commit hooks
pre-commit install # Enable hooks that run before each commit
pre-commit install --hook-type pre-push # Enable hooks that run before each push
# 7. (Optional) Test hooks against all files
pre-commit run --all-files
Hooks run automatically at two stages:
Run automatically on git commit:
- terraform_fmt - Format Terraform files
- terraform_validate - Validate Terraform syntax
- terraform_tflint - Lint Terraform code
- terraform_trivy - Security vulnerability scanning
- terraform_docs - Generate module documentation
- detect-secrets - Prevent commits including secrets
If hooks fail due to formatting:
# Files are auto-formatted but not staged
# Review changes, then:
git add .
git commit -m "Your message"
Recommended workflow (avoid commit rejection):
# 1. Format before staging
terraform fmt -recursive infra-ecs/
terraform fmt -recursive infra-eks/
# 2. Stage and commit (hooks pass on first attempt)
git add .
git commit -m "Your message"
To skip pre-commit hooks:
git commit --no-verify
Run automatically on git push when Terraform files have changed:
- terraform-ecs-tests - Runs all ECS module tests if changes detected in infra-ecs/
- terraform-eks-tests - Runs all EKS module tests if changes detected in infra-eks/
These hooks validate your Terraform modules before pushing to the remote repository. Tests can take several minutes to complete.
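The path-based triggering described above can be illustrated with a small sketch. It models only the decision logic, not the hooks' actual implementation (pre-commit normally handles file matching itself); `changed_in_dir` and `suites_to_run` are hypothetical helpers, and the sketch filters by directory only.

```shell
# Succeed if any path read from stdin lies under the given directory.
changed_in_dir() {
  dir="$1"
  grep -q "^${dir}/"
}

# Print the test suites a push would trigger, given changed file paths on stdin.
suites_to_run() {
  files="$(cat)"
  if printf '%s\n' "$files" | changed_in_dir infra-ecs; then
    echo "terraform-ecs-tests"
  fi
  if printf '%s\n' "$files" | changed_in_dir infra-eks; then
    echo "terraform-eks-tests"
  fi
}

# Example with a hypothetical change list. In a real repository the list could
# come from `git diff --name-only origin/main...HEAD` (origin/main is assumed).
printf '%s\n' "infra-ecs/modules/alb/main.tf" "README.md" | suites_to_run
```

Scoping tests to the directory that changed keeps pushes fast when only one implementation was touched.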
To skip pre-push hooks:
git push --no-verify
- AWS Account with appropriate IAM permissions
- Domain name (required for SSL certificates)
- Terraform 1.0+ installed
- AWS CLI configured (aws configure)
- kubectl (for EKS only)
- Helm (for EKS only)
cd infra-ecs/
# 1. Configure your settings
cd deployment/
# See infra-ecs/README.md for detailed steps
# 2. Deploy ECS infrastructure by committing changes
# See infra-ecs/README.md for detailed steps
# 3. Access your application
# https://yourdomain.com
Detailed ECS Quick Start →
cd infra-eks/
# 1. Configure your settings
cd deployment/
# See infra-eks/README.md for detailed steps
# 2. Deploy EKS infrastructure by committing changes
# See infra-eks/README.md for detailed steps
# 3. Access your application
# https://yourdomain.com
Detailed EKS Quick Start →
Both implementations include complete GitHub Actions workflows:
- Configure AWS credentials as GitHub secrets
- Update configuration files with your values
- Copy either the ECS or EKS workflow files into your .github/workflows/ directory
- Trigger workflows via GitHub UI or git push
ECS Workflows: infra-ecs/docs/CICD_WORKFLOWS.md
EKS Workflows: infra-eks/docs/CICD_WORKFLOWS.md
.
├── infra-ecs/                      # ECS implementation (simpler, AWS-native)
│   ├── deployment/                 # Root modules (orchestration)
│   │   ├── backend/                # S3 state bucket
│   │   ├── hosted_zone/            # Route53 DNS
│   │   ├── ssl/                    # ACM certificate
│   │   ├── ecr/                    # Container registry
│   │   └── app/                    # ECS-specific infrastructure
│   │       ├── vpc/                # Network
│   │       ├── ecs_cluster/        # ECS cluster + ASG
│   │       ├── alb/                # Load balancer
│   │       ├── ecs_service/        # Container service
│   │       └── routing/            # DNS records
│   ├── modules/                    # Child modules (reusable)
│   ├── tests/                      # Terraform tests
│   ├── docs/                       # Additional documentation
│   └── README.md                   # Complete ECS documentation
│
├── infra-eks/                      # EKS implementation (Kubernetes-based)
│   ├── deployment/                 # Root modules (orchestration)
│   │   ├── backend/                # S3 state bucket
│   │   ├── hosted_zone/            # Route53 DNS
│   │   ├── ssl/                    # ACM certificate
│   │   ├── ecr/                    # Container registry
│   │   └── app/                    # EKS-specific infrastructure
│   │       ├── vpc/                # Network (with EKS tags)
│   │       ├── eks_cluster/        # EKS control plane
│   │       ├── eks_node_group/     # Worker nodes
│   │       ├── aws_lb_controller/  # Ingress controller
│   │       ├── k8s_app/            # Kubernetes resources
│   │       └── routing/            # DNS records
│   ├── modules/                    # Child modules (reusable)
│   ├── tests/                      # Terraform tests
│   ├── docs/                       # Additional documentation
│   └── README.md                   # Complete EKS documentation
│
├── .github/workflows/              # GitHub Actions CI/CD
│   ├── ecs/                        # ECS workflows
│   └── eks/                        # EKS workflows
│
├── src/                            # NestJS application
├── Dockerfile                      # Container image definition
├── .pre-commit-config.yaml         # Pre-commit hooks config
└── README.md                       # This file
This project does not implement DynamoDB state locking for Terraform remote state.
What this means:
- Single developer: Safe to use as-is
- Team collaboration: Risk of state corruption from concurrent operations
- Production teams: Should implement state locking
Why it matters:
- Concurrent terraform apply operations can corrupt the state file
- Multiple developers/pipelines can create race conditions
- Changes may be overwritten or lost
To enable state locking:
- Create a DynamoDB table with LockID as the primary key (hash key)
- Uncomment the dynamodb_table parameter in all backend.tf files
- Update with your DynamoDB table name
- Ensure team members have DynamoDB permissions
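The first step above can be sketched with the AWS CLI. This is an illustration under assumptions: the table name `terraform-state-locks` and the on-demand billing mode are choices made for this example, not values from the repository, and the script only prints the command unless `DRY_RUN=0` is set.

```shell
# Hypothetical table name (assumption); use the same name in your backend.tf files.
LOCK_TABLE="${LOCK_TABLE:-terraform-state-locks}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_lock_table() {
  # The lock table needs a string hash key named LockID.
  # PAY_PER_REQUEST billing is an assumption made for this sketch.
  run aws dynamodb create-table \
    --table-name "$LOCK_TABLE" \
    --attribute-definitions AttributeName=LockID,AttributeType=S \
    --key-schema AttributeName=LockID,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST
}

create_lock_table
```

The table could equally be defined in Terraform alongside the state bucket; the CLI form is shown only because it is short.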
Current state: All backend.tf files have dynamodb_table commented out for simplicity in learning environments.
Recommendation:
- Learning/Solo: Can safely omit for reduced complexity
- Production/Team: Always enable state locking
For a complete list of known issues, limitations, and planned improvements, please refer to the Issues section of this GitHub repository. This includes:
- Bug reports and fixes
- Feature requests and enhancements
- Documentation improvements
- Infrastructure optimization opportunities
If you encounter any issues not listed there, please open a new issue with detailed information about the problem.
This is a learning project, but contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Ensure all pre-commit hooks pass
- Submit a pull request with a clear description
This project is open source and available under the MIT License.
- ECS-specific questions: See
infra-ecs/README.md - EKS-specific questions: See
infra-eks/README.md - General issues: Open an issue on GitHub
Happy Learning!