This project is a more polished version of the cloud-engineering-project that Eric Riddoch and I built during the MLOps Club cohort of *Taking Python to Production on AWS*.
In this project, we:
- Built a RESTful API using FastAPI to do CRUD operations against an S3 bucket (see the sketch after this list).
- Implemented 12-factor app principles and RESTful API design.
- Dockerized the application for easy local development and deployment.
- Rigorously tested it using pytest, with mocks for the AWS S3 and OpenAI services (see the test sketch after this list).
- Set up load testing with Locust.
- Wrote the API contract as an OpenAPI spec, auto-generated from the FastAPI code, with pre-commit hooks using `oasdiff` to catch breaking changes and OpenAPI Generator to create a Python client SDK for the API.
- Serverless deployment: deployed the app using AWS CDK, with Docker on AWS Lambda, exposed via API Gateway.
  - NOTE: After deploying, make sure to update the secret in Secrets Manager with your OpenAI API key.
- Used AWS Secrets Manager to store the OpenAI API key and the `AWS-Parameters-and-Secrets-Lambda-Extension` to securely fetch it inside the Lambda function (see the sketch after this list).
- CI/CD Pipeline: Automated testing and deployment using GitHub Actions.
- Observability & Monitoring:
  - Set up in-depth logging on AWS CloudWatch using loguru.
  - Implemented tracing with AWS X-Ray, correlating logs and traces via trace IDs.
  - Published custom metrics to AWS CloudWatch Metrics using `aws-embedded-metrics` (see the sketch after this list).
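
To give a flavor of the CRUD API, here is a minimal sketch of an S3-backed upload route using boto3; the bucket name, route path, and response shape are illustrative, not the project's exact code:

```python
import boto3
from fastapi import FastAPI, Response, UploadFile, status

app = FastAPI()
s3 = boto3.client("s3")
BUCKET_NAME = "some-files-bucket"  # assumption: the real name comes from app settings

@app.put("/v1/files/{file_path:path}")
async def upload_file(file_path: str, file: UploadFile, response: Response):
    """Create or replace a file at `file_path` in the S3 bucket."""
    try:
        # 200 if we are overwriting an existing object, 201 if it is new
        s3.head_object(Bucket=BUCKET_NAME, Key=file_path)
        response.status_code = status.HTTP_200_OK
    except s3.exceptions.ClientError:
        response.status_code = status.HTTP_201_CREATED
    s3.put_object(Bucket=BUCKET_NAME, Key=file_path, Body=await file.read())
    return {"file_path": file_path, "message": "File uploaded"}
```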
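
A rough sketch of an S3-mocked test, assuming moto as the mocking library (one common choice); the app import path and endpoint are assumptions:

```python
import boto3
import pytest
from fastapi.testclient import TestClient
from moto import mock_aws

@pytest.fixture
def client():
    with mock_aws():  # every boto3 S3 call now hits an in-memory fake
        boto3.client("s3", region_name="us-east-1").create_bucket(Bucket="some-files-bucket")
        from files_api.main import app  # hypothetical import path

        yield TestClient(app)

def test_upload_file(client):
    response = client.put(
        "/v1/files/docs/hello.txt",
        files={"file": ("hello.txt", b"hello", "text/plain")},
    )
    assert response.status_code == 201
```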
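
The extension runs alongside the Lambda function and exposes a local HTTP cache on port 2773 (its default), so the function reads the secret with a plain HTTP call instead of hitting Secrets Manager on every invocation. A minimal sketch, with the secret name assumed:

```python
import json
import os
import urllib.request

def get_openai_api_key(secret_id: str = "openai-api-key") -> str:
    """Fetch a secret via the AWS Parameters and Secrets Lambda Extension."""
    url = f"http://localhost:2773/secretsmanager/get?secretId={secret_id}"
    request = urllib.request.Request(url)
    # the extension authenticates callers with the function's session token
    request.add_header("X-Aws-Parameters-Secrets-Token", os.environ["AWS_SESSION_TOKEN"])
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["SecretString"]
```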
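
And a minimal sketch of a custom metric with `aws-embedded-metrics` (the namespace and dimension names are illustrative):

```python
from aws_embedded_metrics import metric_scope

@metric_scope
def record_upload(size_bytes: int, metrics):
    # flushed as structured logs (EMF) that CloudWatch turns into metrics
    metrics.set_namespace("FilesApi")  # illustrative namespace
    metrics.put_dimensions({"Operation": "PutFile"})
    metrics.put_metric("FileSizeBytes", size_bytes, "Bytes")
```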
Improvements, done and planned:

- Added Secrets Manager to store the OpenAI API key securely.
- Used AWS SSM Parameter Store to store the OpenAI API key instead of Secrets Manager (ref).
  - Standard parameters in Parameter Store are free, unlike Secrets Manager, which costs $0.40 per secret per month.
- Set up the Dockerfile following the recommended way of using uv in Docker.
- Implement an API versioning strategy (like `v1` in the path); see the sketch after this list.
- Implement authentication (API keys or AWS Cognito) and secure the Swagger UI page, and possibly the API endpoints as well.
- Add rate limiting to the API using API Gateway (see the CDK sketch after this list).
- Set up a CI/CD pipeline to deploy the API to AWS using GitHub Actions.
- Implement a multi-environment deployment pipeline (dev/prod) with approval gates.
- Observability & Monitoring improvements:
  - Use OpenTelemetry for tracing instead of AWS X-Ray (ref).
  - Set up Grafana dashboards with CloudWatch data sources for enhanced monitoring.
  - Replace CloudWatch with the Grafana stack for logs, metrics, and traces.
- Container Orchestration:
  - Containerize the app and deploy it behind an Application Load Balancer, using either:
    - ECS Fargate (serverless)
    - Amazon EC2
  - Set up auto-scaling based on request load or CPU/memory usage.
  - Deploy on Amazon EKS (Kubernetes).
- Add a custom domain with Route53 and ACM for HTTPS (`https://api.myapp.com/v1/`).
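
For the versioning item above, a sketch of what path-based versioning could look like with FastAPI routers (names illustrative):

```python
from fastapi import APIRouter, FastAPI

v1_router = APIRouter(prefix="/v1/files", tags=["files-v1"])

@v1_router.get("/{file_path:path}")
async def get_file(file_path: str):
    ...

app = FastAPI()
app.include_router(v1_router)
# a future router with prefix="/v2/files" can live alongside v1,
# letting old clients keep working while the new surface evolves
```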
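
And for the rate-limiting item, a sketch of stage-level throttling with CDK; the construct IDs, limits, and Lambda lookup are illustrative, assuming the real stack wires in its own function:

```python
from aws_cdk import Stack, aws_apigateway as apigw, aws_lambda as _lambda
from constructs import Construct

class FilesApiStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # stand-in for the existing Dockerized Lambda function
        lambda_fn = _lambda.Function.from_function_name(self, "Fn", "files-api")
        apigw.LambdaRestApi(
            self, "FilesApi",
            handler=lambda_fn,
            deploy_options=apigw.StageOptions(
                throttling_rate_limit=10,   # steady-state requests/second
                throttling_burst_limit=25,  # short burst capacity
            ),
        )
```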
Install `uv`, AWS CLI v2, and Node.js before running.
- Clone the repo:

  ```bash
  git clone https://github.com/avr2002/files-api.git
  ```
- Install dependencies:

  ```bash
  uv sync --all-groups
  ```
- If you are using AWS SSO, you can activate your profile by running:

  ```bash
  AWS_PROFILE=sandbox aws configure sso --profile sandbox
  # OR
  aws sso login --profile sandbox
  ```
- Set up infra:

  ```bash
  # Bootstrap the CDK environment
  ./run cdk-bootstrap

  # Deploy the CDK stack
  # This will create the S3 bucket and SageMaker domain
  ./run cdk-deploy
  ```
- Create a `.openai.env` file with your OpenAI API key:

  ```bash
  export OPENAI_API_KEY=your_openai_api_key_here
  ```
- Clean up:

  ```bash
  # Destroy infra
  ./run cdk-destroy

  # Clean up local files
  ./run clean
  ```
- Start the API locally with mock S3 and OpenAI services:

  ```bash
  ./run run-mock
  ```
- Run locally with Docker:

  ```bash
  ./run run-docker
  ```

  If you want to use real AWS credentials and the OpenAI service, modify the `docker-compose.yaml` file.
- Run Locust for load testing:

  ```bash
  ./run run-locust
  ```

  If you want to load test against the deployed API, modify the `docker-compose.locust.yaml` file with the deployed API Gateway URL.
- Generate the OpenAPI spec:

  ```bash
  uv run scripts/generate-openapi.py generate --output-spec=openapi.json
  ```
- Generate the client library from the OpenAPI spec (see the usage sketch after these steps):

  ```bash
  ./run generate-client-library
  ```
- Run tests with coverage:

  ```bash
  ./run run-tests

  # Serve the HTML coverage report
  ./run serve-coverage-report
  ```
- Lint the codebase:

  ```bash
  ./run lint
  # OR
  ./run lint:ci
  ```
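
Once generated, the client SDK could be used roughly like this; the package, class, and method names below are hypothetical and depend on the OpenAPI Generator configuration:

```python
# Hypothetical usage of the generated Python client SDK.
from files_api_client import ApiClient, Configuration
from files_api_client.api.default_api import DefaultApi

config = Configuration(host="http://localhost:8000")
with ApiClient(config) as api_client:
    api = DefaultApi(api_client)
    print(api.list_files())  # hypothetical operation name
```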
Locust Dashboard showing Deployed API performance under load
X-Ray Trace with Correlated Logs for a successful PUT Request
X-Ray Trace with Correlated Logs for a failed POST Request
API Gateway 4XX and 5XX Errors Metric
Similarly, you can view other metrics like Lambda invocations, duration, errors, and throttles, as well as S3 metrics.
Contributions are welcome!
- Complete the setup process above
- Make your changes
- Run tests and linting
- Submit a pull request




