
Taskflow

A distributed task processing system built with a microservices architecture. This project demonstrates asynchronous job processing, message queuing, containerization, cloud deployment, and CI/CD automation.

What it is

Taskflow is a two-service backend system that separates HTTP request handling from background processing. When a task is created via the API, it is published to a message queue and processed asynchronously by a worker service. This mirrors real-world patterns used in order processing, notification, and job queue systems.
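
As a rough illustration of that split, the sketch below models both services in a single process. The `taskTable` map and `jobQueue` array are in-memory stand-ins for PostgreSQL and RabbitMQ, and the function names are hypothetical; none of this is taken from the Taskflow source.

```javascript
// In-memory stand-ins (assumptions for illustration only).
const taskTable = new Map(); // plays the role of the tasks table in PostgreSQL
const jobQueue = [];         // plays the role of the RabbitMQ queue
let nextTaskId = 1;

// API side: store the task as "pending", publish a job, return immediately.
function createTask(title) {
  const task = { id: nextTaskId++, title, status: 'pending' };
  taskTable.set(task.id, task);
  jobQueue.push({ taskId: task.id });
  return { ...task };
}

// Worker side: consume one job and walk the task through its statuses.
// Returns the list of statuses the task went through, or null if the queue is empty.
function processNextJob() {
  const job = jobQueue.shift();
  if (!job) return null;
  const task = taskTable.get(job.taskId);
  const seen = [task.status];
  task.status = 'processing';
  seen.push(task.status);
  // The real worker would do its work here before finishing.
  task.status = 'done';
  seen.push(task.status);
  return seen;
}
```

The point of the shape is that `createTask` never calls `processNextJob`: the caller gets its response as soon as the job is queued, and the worker drains the queue on its own schedule.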

Architecture

HTTP Client
     │
     ▼
API Service (Node.js)
     │
     ├──── PostgreSQL (stores tasks + status)
     │
     └──── RabbitMQ (publishes job)
                │
                ▼
         Worker Service (Node.js)
                │
                └──── PostgreSQL (updates status: pending → processing → done)

All services run as Docker containers on AWS EC2, provisioned with Terraform and deployed automatically via GitHub Actions.

Tech Stack

| Layer | Technology |
| --- | --- |
| API Service | Node.js, Express |
| Worker Service | Node.js |
| Message Queue | RabbitMQ (quorum queues) |
| Database | PostgreSQL |
| Logging | Winston (structured JSON logs) |
| Containerization | Docker, Docker Compose |
| Infrastructure | AWS EC2, Terraform |
| CI/CD | GitHub Actions |
| Observability | CloudWatch, /metrics endpoint, structured logs |

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | / | Create a task |
| GET | / | Get all tasks |
| GET | /:id | Get a task by ID |
| GET | /metrics | Get task counts by status |
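
The four routes can be sketched as a single dispatch function. This is a minimal sketch, not the project's Express code: `taskStore` is a hypothetical in-memory array, and the status codes (201, 400, 404) and error bodies are assumptions.

```javascript
// Hypothetical in-memory store; the real API keeps tasks in PostgreSQL.
const taskStore = [];

// Maps a method + path to a response object, mirroring the endpoint table.
function handleRequest(method, path, body) {
  if (method === 'POST' && path === '/') {
    if (!body || typeof body.title !== 'string') {
      return { status: 400, body: { error: 'title is required' } };
    }
    const task = { id: taskStore.length + 1, title: body.title, status: 'pending' };
    taskStore.push(task);
    return { status: 201, body: task };
  }
  if (method === 'GET' && path === '/') {
    return { status: 200, body: taskStore };
  }
  if (method === 'GET' && path === '/metrics') {
    const counts = {};
    for (const task of taskStore) {
      counts[task.status] = (counts[task.status] || 0) + 1;
    }
    return { status: 200, body: counts };
  }
  // Only numeric IDs match here, so "/metrics" can never be mistaken for an ID.
  const match = /^\/(\d+)$/.exec(path);
  if (method === 'GET' && match) {
    const task = taskStore.find((t) => t.id === Number(match[1]));
    return task
      ? { status: 200, body: task }
      : { status: 404, body: { error: 'task not found' } };
  }
  return { status: 404, body: { error: 'not found' } };
}
```

In a router that matches `/:id` against any path segment, the `/metrics` route would need to be registered before `/:id` for the same reason.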

Creating a task with Postman:

Send a POST request to http://<PUBLIC_IP>:3000 with the following JSON body:

{
  "title": "my task"
}

Response:

{
  "id": 1,
  "title": "my task",
  "status": "pending",
  "created_at": "2026-04-27T10:00:00.000Z",
  "updated_at": "2026-04-27T10:00:00.000Z"
}

The worker picks up the job, sets the status to processing, waits 10 seconds, and then sets it to done (pending → processing → done).
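
The same request can be sent from Node.js instead of Postman. The sketch below assumes Node 18 or later (for the built-in `fetch`) and that `GET /:id` returns the task as JSON with a `status` field, as the response above suggests. The function names, the `fetchImpl` parameter, and the polling interval are illustrative choices.

```javascript
// Sends the POST described above and returns the created task.
// `fetchImpl` is injectable so the function can be exercised without a server.
async function postTask(baseUrl, title, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(baseUrl + '/', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ title }),
  });
  if (!res.ok) throw new Error(`Create failed with HTTP ${res.status}`);
  return res.json();
}

// Polls GET /:id until the task reports "done" or the attempts run out.
async function waitUntilDone(baseUrl, id, options = {}) {
  const { attempts = 15, delayMs = 1000, fetchImpl = globalThis.fetch } = options;
  for (let i = 0; i < attempts; i++) {
    const res = await fetchImpl(`${baseUrl}/${id}`);
    const task = await res.json();
    if (task.status === 'done') return task;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Task ${id} did not finish in time`);
}
```

With the defaults, `waitUntilDone` polls once per second for up to 15 attempts, which leaves some margin over the worker's 10-second wait.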

Demo Video

demo.mp4

CI/CD Pipeline

Every push to main triggers a GitHub Actions workflow that:

  1. SSHs into the EC2 instance
  2. Pulls the latest code from GitHub
  3. Rebuilds Docker images
  4. Restarts the stack

Secrets stored in GitHub: EC2_HOST, EC2_USER, EC2_SSH_KEY.
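
The four steps amount to one SSH invocation that runs a short command chain on the server. The sketch below only builds those strings; it is not the project's workflow file. The repo path and the `docker-compose up --build -d` command appear elsewhere in this README, while `git pull origin main` and both function names are assumptions.

```javascript
// Builds the remote command a deploy job could run on the EC2 instance.
function buildDeployCommand({ repoDir = '/home/ubuntu/Taskflow', branch = 'main' } = {}) {
  const steps = [
    `cd ${repoDir}`,
    `git pull origin ${branch}`,    // step 2: pull the latest code (assumed command)
    'docker-compose up --build -d', // steps 3 and 4: rebuild images, restart the stack
  ];
  // Joining with && stops the deploy at the first failing step.
  return steps.join(' && ');
}

// Builds the argument list for the ssh client (step 1) from the three secrets.
// `keyPath` is a file holding the EC2_SSH_KEY contents.
function buildSshArgs({ host, user, keyPath }, remoteCommand) {
  return ['-i', keyPath, `${user}@${host}`, remoteCommand];
}
```

Chaining with `&&` rather than `;` is the design choice worth noting: a failed pull then leaves the running stack untouched instead of rebuilding from stale code.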

Observability

  • Structured logs: Winston emits JSON logs with task_id, title, duration_ms, and error fields
  • Metrics endpoint: GET /metrics returns live counts of tasks by status
  • CloudWatch: both containers ship logs to the taskflow log group in AWS CloudWatch via the awslogs Docker driver
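
To make the first two items concrete, here is a minimal sketch of a log line carrying the listed fields and of a count-by-status aggregation. Plain `JSON.stringify` stands in for Winston, and the exact shape of the real `/metrics` response is an assumption; both function names are hypothetical.

```javascript
// Formats one structured log line with the fields listed above.
// Fields that are undefined are dropped by JSON.stringify.
function formatTaskLog(level, message, { taskId, title, durationMs, error } = {}) {
  const entry = { level, message, task_id: taskId, title, duration_ms: durationMs };
  if (error) entry.error = error.message;
  return JSON.stringify(entry);
}

// Counts tasks by status, the kind of payload GET /metrics is described as returning.
function countByStatus(taskList) {
  const counts = { pending: 0, processing: 0, done: 0 };
  for (const task of taskList) {
    counts[task.status] = (counts[task.status] || 0) + 1;
  }
  return counts;
}
```

Keeping one JSON object per line is what lets CloudWatch filter on fields such as `task_id` rather than on free text.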

Architecture Diagram

Infrastructure Diagram

Deploy Your Own

This section walks you through deploying Taskflow to your own AWS account from scratch. You will need: an AWS account, Terraform installed, the AWS CLI installed and configured, and an SSH key pair.

1. Fork the Repository

Fork this repository to your own GitHub account before doing anything else. All subsequent steps assume you are working from your fork. This matters for two reasons: your GitHub secrets and CI/CD pipeline need to point to a repo you own, and Terraform will clone your fork onto the EC2 instance so that future deployments pull from the right place.

2. Prerequisites

Install and configure the AWS CLI:

aws configure

You will be prompted for your AWS Access Key ID, Secret Access Key, default region (e.g. us-east-1), and output format (json). You can generate access keys from the AWS Console under your account name → Security Credentials → Access Keys.

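Generate an SSH key pair if you don't have one already. The command below uses PowerShell path syntax; on macOS or Linux, pass -f ~/.ssh/taskflow-key instead: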

ssh-keygen -t rsa -b 4096 -f "$env:USERPROFILE\.ssh\taskflow-key"

Press enter twice for no passphrase. This creates taskflow-key (private) and taskflow-key.pub (public) in ~/.ssh/.

3. Provision Infrastructure with Terraform

Clone your forked repository and navigate to the terraform folder:

git clone https://github.com/<YOUR_GITHUB_USERNAME>/Taskflow.git
cd Taskflow/terraform

Initialize Terraform and apply:

terraform init
terraform apply -var username=<YOUR_GITHUB_USERNAME>

Terraform will create:

  • An EC2 t3.micro instance running Ubuntu 24.04
  • A security group opening ports 22 (SSH), 3000 (API), and 15672 (RabbitMQ management UI)
  • An SSH key pair using your ~/.ssh/taskflow-key.pub
  • Docker and Docker Compose installed on the instance automatically via user_data
  • The Taskflow repo cloned automatically on the instance

When terraform apply completes it will print the public IP of your instance:

Outputs:
public_ip = "x.x.x.x"

Save this IP. You will need it in the next steps.
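
If you would rather not copy the IP by hand, `terraform output -json` prints the outputs as JSON, with each output's value under a `value` key. The sketch below derives the addresses used in later steps from that JSON; the function name and the shape of the returned object are hypothetical.

```javascript
// Extracts the instance address from the text printed by `terraform output -json`
// and builds the addresses used in the following steps.
function endpointsFromTerraformOutput(jsonText) {
  const outputs = JSON.parse(jsonText);
  const ip = outputs.public_ip && outputs.public_ip.value;
  if (!ip) throw new Error('public_ip output not found');
  return {
    api: `http://${ip}:3000`,         // API service
    rabbitmqUi: `http://${ip}:15672`, // RabbitMQ management UI
    ssh: `ubuntu@${ip}`,              // SSH target for the Ubuntu instance
  };
}
```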

4. Attach CloudWatch IAM Role

The containers ship logs to CloudWatch via the awslogs Docker driver. For this to work, the EC2 instance needs permission to write to CloudWatch. Without that permission, the stack will fail to start.

Create the IAM role:

Go to AWS Console → IAM → Roles → Create role:

  • Trusted entity: AWS service → EC2
  • Permissions: search for and add CloudWatchAgentServerPolicy
  • Name it taskflow-cloudwatch-role
  • Click Create role

Attach it to your instance:

Go to AWS Console → EC2 → select your instance → Actions → Security → Modify IAM role → select taskflow-cloudwatch-role → Update IAM role.

Create the CloudWatch log group:

aws logs create-log-group --log-group-name taskflow --region <YOUR_REGION>

5. Configure Environment Variables on the Server

Wait 1-2 minutes after provisioning for the instance to finish booting, then SSH in:

ssh -i ~/.ssh/taskflow-key ubuntu@<YOUR_PUBLIC_IP>

The repo is already cloned at /home/ubuntu/Taskflow by the user_data script. Navigate into it:

cd /home/ubuntu/Taskflow

Create an environment file for each service. The commands use sudo tee because the folder is owned by root, and the quoted 'EOF' delimiter keeps the shell from expanding characters such as $ in your password:

sudo tee api-service/.env << 'EOF'
DB_USER=postgres
DB_HOST=postgres
DB_NAME=taskflowdb
DB_PASSWORD=yourpassword
DB_PORT=5432
EOF
sudo tee worker-service/.env << 'EOF'
DB_USER=postgres
DB_HOST=postgres
DB_NAME=taskflowdb
DB_PASSWORD=yourpassword
DB_PORT=5432
EOF

Replace yourpassword with a password of your choice. Make sure DB_HOST is set to postgres (the Docker Compose service name), not localhost.
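
For context on how these variables are typically consumed, the sketch below turns them into a database connection config and fails fast when one is missing. The returned keys (`user`, `host`, `database`, `password`, `port`) are the option names the node-postgres client accepts, but whether Taskflow uses that client, and the function name itself, are assumptions.

```javascript
// Builds a database config from environment variables like the ones in .env.
function dbConfigFromEnv(env) {
  const required = ['DB_USER', 'DB_HOST', 'DB_NAME', 'DB_PASSWORD', 'DB_PORT'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    user: env.DB_USER,
    host: env.DB_HOST, // "postgres" resolves to the Compose service, not localhost
    database: env.DB_NAME,
    password: env.DB_PASSWORD,
    port: Number(env.DB_PORT), // environment values are always strings
  };
}
```

A service would call it as `dbConfigFromEnv(process.env)` at startup, so a missing variable surfaces as a clear error rather than a connection timeout.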

6. Start the Stack

From the repo root on the server:

docker-compose up --build -d

This starts RabbitMQ, PostgreSQL, the API service, and the worker service as containers. Verify everything is running:

docker-compose ps

7. Create the Database Table

docker-compose exec postgres psql -U postgres -d taskflowdb -c "CREATE TABLE tasks (id SERIAL PRIMARY KEY, title VARCHAR(255) NOT NULL, status VARCHAR(50) DEFAULT 'pending', created_at TIMESTAMP DEFAULT NOW(), updated_at TIMESTAMP DEFAULT NOW());"

Your API is now publicly reachable at http://<YOUR_PUBLIC_IP>:3000.
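
One detail of this schema: DEFAULT NOW() fills updated_at only when a row is inserted, so every status change has to set it explicitly. The sketch below builds such an update as a parameterized query. The `{ text, values }` object is the query shape node-postgres accepts; the choice of that client, the `STATUSES` list, and the function name are assumptions.

```javascript
// Statuses named in this README.
const STATUSES = ['pending', 'processing', 'done'];

// Returns a parameterized UPDATE for a status change on the tasks table.
function buildStatusUpdate(taskId, status) {
  if (!STATUSES.includes(status)) {
    throw new Error(`Unknown status: ${status}`);
  }
  return {
    // updated_at is set here because the column default applies only on INSERT.
    text: 'UPDATE tasks SET status = $1, updated_at = NOW() WHERE id = $2',
    values: [status, taskId],
  };
}
```

Passing the status and ID as parameters rather than concatenating them into the SQL string keeps user-supplied values out of the statement text.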

8. Set Up CI/CD

To enable automatic deployment on every push to main, add the following secrets to your forked GitHub repository under Settings → Secrets and variables → Actions:

| Secret | Value |
| --- | --- |
| EC2_HOST | Your EC2 public IP |
| EC2_USER | ubuntu |
| EC2_SSH_KEY | Contents of your ~/.ssh/taskflow-key private key file |

Once set, every push to main will SSH into your EC2 instance, pull the latest code, rebuild the images, and restart the stack automatically. You will not need to SSH in again to apply updates.

9. View the RabbitMQ Dashboard

Open http://<YOUR_PUBLIC_IP>:15672 in your browser. Log in with guest / guest. You can monitor queues, message rates, and connections here.

10. Tear Down

To destroy all AWS resources created by Terraform and stop incurring charges:

cd terraform
terraform destroy

Design Decisions

  • Separating API and Worker services

    The API can respond immediately without waiting for processing to complete. If the worker is slow or crashes, the API keeps accepting requests. This decoupling is a common foundation for resilient backend systems.

  • Choosing RabbitMQ over direct HTTP calls between services

    Direct HTTP between services creates tight coupling: if the worker is down, the API's calls to it fail too. RabbitMQ acts as a buffer, so jobs queue up and are processed when the worker is ready. Quorum queues are durable, so queued jobs survive a RabbitMQ restart.

  • Docker Compose over Kubernetes

    Kubernetes adds significant operational overhead that isn't justified for a two-service system. Docker Compose keeps the deployment simple and reproducible while still demonstrating containerization and multi-service orchestration.

  • Terraform over manual AWS setup

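    Infrastructure as code means the Terraform-managed resources (instance, security group, key pair) can be recreated from scratch with one command, with no undocumented manual steps. The CloudWatch IAM role and log group from step 4 are the exceptions; they are still created by hand.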
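
The buffering argument behind the RabbitMQ decision can be sketched with a small in-memory queue that tracks acknowledgements. This is a toy model, not RabbitMQ's implementation, and it assumes the worker acknowledges a job only after finishing it, which this README does not state. The `AckQueue` class and its method names are hypothetical.

```javascript
// Minimal in-memory model of a queue with acknowledgements.
class AckQueue {
  constructor() {
    this.ready = [];          // jobs waiting for a consumer
    this.unacked = new Map(); // jobs delivered but not yet acknowledged
    this.nextTag = 1;
  }

  // Producers can publish even when no consumer is connected.
  publish(job) {
    this.ready.push(job);
  }

  // Hands the next job to a consumer and remembers it until it is acked.
  deliver() {
    if (this.ready.length === 0) return null;
    const job = this.ready.shift();
    const tag = this.nextTag++;
    this.unacked.set(tag, job);
    return { tag, job };
  }

  // The consumer confirms the job is finished; only now is it forgotten.
  ack(tag) {
    this.unacked.delete(tag);
  }

  // Called when a consumer disconnects: its unacked jobs go back to the front.
  requeueUnacked() {
    this.ready.unshift(...this.unacked.values());
    this.unacked.clear();
  }
}
```

Two properties follow from this shape: publishing never depends on a worker being up, and a job handed to a worker that crashes before acknowledging it is delivered again rather than dropped.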

About

Taskflow is a distributed, containerized task-processing system that uses Node.js, RabbitMQ, and PostgreSQL to handle API requests asynchronously via a dedicated worker service, with automated deployment on AWS using Terraform and GitHub Actions.
