Add iac/dev/ for single-EC2 dev environment #71

Open
amalbet wants to merge 4 commits into main from feature/dev-iac

amalbet commented Apr 16, 2026

Summary

Adds iac/dev/ — an isolated Terraform module for a single-EC2 dev environment running the full docker-compose stack via setup.sh. Intended as a low-risk starting point for AWS provisioning, separate from the existing ECS-based iac/ module.

What it provisions

  • VPC 10.100.0.0/16 with a public subnet in a single AZ (isolated from existing iac/ which uses 10.0.0.0/16)
  • EC2 t3.large Ubuntu 22.04, user-data bootstraps Docker + clones local-deployment + runs setup.sh --edges 2
  • Security group scoped to an allowed_ips variable — no 0.0.0.0/0 exposure on any port
  • IAM role for SSM Session Manager — no SSH key management required

Ports exposed to allowed_ips only: 4200 (UI), 8082 (B2B), 10016 (Odoo), 8086 (InfluxDB).
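
For readers who have not opened the diff, the allowlist approach described above could look roughly like the sketch below. The actual resource and variable definitions are not shown in this thread, so every name here (`aws_vpc.dev`, `aws_security_group.dev`, `local.exposed_ports`) is hypothetical, and the open egress rule and the validation block are assumptions rather than something the PR states.

```hcl
# Sketch only: hypothetical names, not the PR's actual code.
variable "allowed_ips" {
  description = "CIDR blocks allowed to reach the dev instance"
  type        = list(string)

  # Assumption: reject a world-open entry at plan time.
  validation {
    condition     = !contains(var.allowed_ips, "0.0.0.0/0")
    error_message = "allowed_ips must not contain 0.0.0.0/0."
  }
}

locals {
  # Ports listed in the PR description.
  exposed_ports = {
    ui       = 4200
    b2b      = 8082
    odoo     = 10016
    influxdb = 8086
  }
}

resource "aws_vpc" "dev" {
  cidr_block = "10.100.0.0/16"
}

resource "aws_security_group" "dev" {
  name_prefix = "openems-dev-"
  vpc_id      = aws_vpc.dev.id

  # One ingress rule per exposed port, each limited to allowed_ips.
  dynamic "ingress" {
    for_each = local.exposed_ports
    content {
      description = ingress.key
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = var.allowed_ips
    }
  }

  # Assumption: outbound stays open so the bootstrap can pull
  # packages and images and the SSM agent can reach AWS.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

With this shape, changing `allowed_ips` only alters the ingress rules of the security group, which is consistent with the "only modifies the SG" item in the test plan.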

Design decisions

  • Single EC2 over ECS: ECS Fargate can't run OpenVPN (no NET_ADMIN) and has no local persistent storage for InfluxDB. Single EC2 with docker-compose matches the architecture doc's Phase 1 recommendation and reuses the proven setup.sh we've been testing locally.
  • SSM over SSH: No inbound port 22, no SSH key management. Session Manager via AWS CLI gives us shell access without exposing SSH to the internet.
  • Reuses existing state bucket: openems-deployment-tf-state-file with key iac/dev/terraform.tfstate — no new S3/DynamoDB resources.
  • IP allowlist as a variable: Add/remove IPs by editing terraform.tfvars + terraform apply (~10s, no instance disruption).
  • user_data in lifecycle.ignore_changes: Prevents accidental instance replacement when the bootstrap script is edited. Re-bootstrapping requires explicit destroy + recreate.
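
The SSM and `ignore_changes` decisions above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: the resource names, the `ami_id` and `subnet_id` variables, and the `user-data.sh` path are all hypothetical. The managed policy `AmazonSSMManagedInstanceCore` is the standard AWS policy for Session Manager access.

```hcl
# Sketch only: hypothetical names, not the PR's actual code.
data "aws_iam_policy_document" "ec2_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "ssm" {
  name_prefix        = "openems-dev-ssm-"
  assume_role_policy = data.aws_iam_policy_document.ec2_assume.json
}

# AWS-managed policy that lets the SSM agent register the instance,
# so no inbound port 22 or SSH key pair is needed.
resource "aws_iam_role_policy_attachment" "ssm_core" {
  role       = aws_iam_role.ssm.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_iam_instance_profile" "ssm" {
  name_prefix = "openems-dev-ssm-"
  role        = aws_iam_role.ssm.name
}

variable "ami_id" { type = string }    # hypothetical: Ubuntu 22.04 AMI ID
variable "subnet_id" { type = string } # hypothetical: the public subnet

resource "aws_instance" "dev" {
  ami                  = var.ami_id
  instance_type        = "t3.large"
  subnet_id            = var.subnet_id
  iam_instance_profile = aws_iam_instance_profile.ssm.name
  user_data            = file("${path.module}/user-data.sh") # hypothetical path

  lifecycle {
    # Edits to the bootstrap script no longer produce a diff on the
    # instance; re-bootstrapping needs an explicit destroy + recreate.
    ignore_changes = [user_data]
  }
}
```

The trade-off is that a changed bootstrap script is silently ignored by `terraform plan`, so anyone editing it has to remember to recreate the instance for the change to take effect.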

What's NOT in this PR

  • OpenVPN server (deferred — needs PKI setup)
  • TLS termination (defense in depth — deferred)
  • MBE deployment (separate service, separate ticket)
  • InfluxDB persistence across EC2 replacements (data lives on the root volume only for now)

Builds on the architecture doc in #70.

Test plan

  • terraform init succeeds against the existing state bucket
  • terraform plan shows a clean create for VPC + subnet + IGW + RT + SG + IAM + EC2
  • terraform apply completes without errors
  • After ~10 min, http://<public-ip>:4200 serves the OpenEMS UI
  • http://<public-ip>:8082/jsonrpc with Basic auth returns a valid response
  • aws ssm start-session --target <instance-id> opens a shell
  • Adding an IP to allowed_ips and re-applying only modifies the SG
  • terraform destroy cleans up all resources

🤖 Generated with Claude Code

Provisions an isolated dev environment on AWS: VPC 10.100.0.0/16 with a
public subnet, t3.large Ubuntu 22.04 instance running the full
docker-compose stack via setup.sh, security group scoped to an
allowed_ips variable (no 0.0.0.0/0 exposure), and an IAM role for SSM
Session Manager access (no SSH key required).

State is stored in the existing openems-deployment-tf-state-file S3
bucket under a separate key (iac/dev/terraform.tfstate) so it does not
collide with the production ECS deployment in iac/.

User-data clones the local-deployment branch and runs setup.sh --edges 2
on first boot. Bootstrap takes ~10 min on a fresh instance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Alejandro Malbet <amalbet@gmail.com>
@aidan-barnes-axm

Contributor

I should note up front that this change is applied to the main branch, which still had some incomplete Terraform code and Dockerfiles for AWS deployment when I last examined it in Nov/Dec 2025 (AI-generated code pending further review). The only commits to main since then have been housekeeping for OpenSSF integration, not merges from more functional branches. The key blocker to ending the main branch's stagnation is the migration to EIOT AWS and making the Terraform code more AWS-org-agnostic than the current implementation.

> Reuses existing state bucket: openems-deployment-tf-state-file with key iac/dev/terraform.tfstate, no new S3/DynamoDB resources.

AI inference artifact: the S3 bucket and DynamoDB table are backend components that an admin/devops configures manually as part of the initial Terraform setup.

More comments to follow.

Signed-off-by: Aidan Barnes <66229298+aidan-barnes-axm@users.noreply.github.com>

aidan-barnes-axm commented Apr 20, 2026

Contributor
  • terraform init succeeds when configured manually with the state bucket and DynamoDB table
  • terraform plan appears to create an IGW, IAM role + instance profile, security group, EC2 instance, route table, subnet, and VPC
  • terraform apply does NOT complete without errors; there are still some permissions to troubleshoot.
  • terraform destroy failed to clean up because of another permissions error. I'll look at this in the morning.

amalbet commented Apr 21, 2026

Author

Thanks for testing this Aidan — really helpful to have the init/plan/apply results.

Two things we need from you to update the code before the next apply attempt:

  1. State bucket + lock table names: What did you name the S3 bucket and DynamoDB table you created in the dev account? We need to update iac/dev/backend.tf to match (currently references the old account's openems-deployment-tf-state-file).

  2. Tag value: From Zulip you mentioned your IAM policies use Environment=aidev to scope access. Our Terraform defaults to Environment=dev in provider.tf. Which value should we align on? We'll update whichever side makes more sense.

Once we have those, we'll push fixes and the apply/destroy errors should clear up (the tag mismatch is likely what's blocking both).

aidan-barnes-axm commented Apr 21, 2026

Contributor
> 1. **State bucket + lock table names**: What did you name the S3 bucket and DynamoDB table you created in the dev account? We need to update `iac/dev/backend.tf` to match (currently references the old account's `openems-deployment-tf-state-file`).

I have already sent you the state bucket and DynamoDB lock table names on Zulip; I wasn't sure it was a good idea to publish backend configuration metadata in a public GitHub repo.

> 2. **Tag value**: From Zulip you mentioned your IAM policies use `Environment=aidev` to scope access. Our Terraform defaults to `Environment=dev` in `provider.tf`. Which value should we align on? We'll update whichever side makes more sense.

c492c39: I didn't attach it to this PR, but I did update provider.tf to tag Environment = aidev, as I feel it's thematically appropriate given the objective of enabling agentic AI involvement.

✔ I can verify that terraform apply and destroy now complete without errors, and I have updated the test plan checklist accordingly. It was just a few missing permissions in the SCP, which I have documented in PR #72 via commits.

amalbet commented Apr 21, 2026

Author

Thanks, I missed it. Got it now.

- backend.tf: point to Aidan's state bucket
  (docker-openems-feature-dev-iac) and lock table
  (docker-openems-feature-dev-iac-state-lock) in the dev account
- provider.tf: change Environment tag from "dev" to "aidev" to match
  the IAM policies Aidan configured for the dev account

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Alejandro Malbet <amalbet@gmail.com>
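
As an illustration of the provider.tf change in the commit message above, a provider-level tag default could be written as in the sketch below. It assumes the Environment tag is set through the AWS provider's `default_tags` block, which the thread implies but does not show; the `aws_region` variable and its default are placeholders, since no region is named here.

```hcl
# Sketch only: assumes the tag is applied via default_tags.
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

variable "aws_region" {
  type    = string
  default = "eu-west-1" # placeholder; the thread does not name a region
}

provider "aws" {
  region = var.aws_region

  # Applied to every taggable resource this provider creates, so the
  # value has to match what the dev account's IAM policies expect.
  default_tags {
    tags = {
      Environment = "aidev"
    }
  }
}
```

Keeping the tag in one place means a future rename is a single-line change rather than an edit to every resource.
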
S3 bucket and DynamoDB table names are now provided at init time
via backend.tfvars (gitignored), not hardcoded in backend.tf.
This keeps infrastructure metadata out of the public repo.

Usage: terraform init -backend-config=backend.tfvars

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Alejandro Malbet <amalbet@gmail.com>
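
The partial backend configuration described in the commit message above might look like the following sketch. Which settings stay in backend.tf is an assumption (the state key is shown here because the PR description names it), and the values in backend.tfvars are placeholders, not the real bucket or table names.

```hcl
# backend.tf (committed): only non-sensitive settings are hardcoded.
terraform {
  backend "s3" {
    key = "iac/dev/terraform.tfstate"
  }
}
```

```hcl
# backend.tfvars (gitignored): placeholder values only.
bucket         = "example-state-bucket"
dynamodb_table = "example-state-lock"
region         = "eu-west-1"
```

Terraform merges the two at `terraform init -backend-config=backend.tfvars`, so each contributor needs a local copy of backend.tfvars before the first init.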
amalbet added a commit that referenced this pull request May 18, 2026
Captures the explicit PR #71 scope decision (UI/Odoo/B2B/WS over plain
HTTP, IP-allowlisted) so it's not silently normalized, and opens the
team discussion for HTTPS options (self-signed, Caddy+LE, ALB+ACM)
before the dev env becomes persistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Alejandro Malbet <amalbet@gmail.com>