Ephemeral / auto-destroy workspaces (scheduled + inactivity TTL) #345

@mattrobinsonsre

Summary

Add ephemeral / auto-destroy workspaces: scheduled destroy at a set time, and/or auto-destroy after a period of state inactivity (TTL). TFE/HCP Terraform offers both (paid tiers); it's a top cost-control feature for short-lived/test environments. The machinery to do this safely already exists in Terrapod.

TFE/HCP parity reference

  • Scheduled destroy: one-shot destroy at a specific date/time.
  • Auto-destroy on inactivity: destroy after N hours/days without a state change; reschedules on each apply.
  • Reminder + result notifications.
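The two timer modes above can be illustrated with a small deadline calculation. This is a sketch of the semantics as described here, not TFE's implementation; `next_destroy_time` and its parameter names are hypothetical, and picking the earlier deadline when both modes are set is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_destroy_time(
    scheduled_at: Optional[datetime],
    inactivity_ttl: Optional[timedelta],
    last_state_change: Optional[datetime],
) -> Optional[datetime]:
    """Return the earliest moment a destroy becomes due, or None if disabled."""
    candidates = []
    if scheduled_at is not None:
        # One-shot mode: a fixed point in time.
        candidates.append(scheduled_at)
    if inactivity_ttl is not None and last_state_change is not None:
        # Inactivity mode: the deadline is relative to the last state change,
        # so every new apply pushes it forward.
        candidates.append(last_state_change + inactivity_ttl)
    return min(candidates) if candidates else None


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
ttl = timedelta(hours=24)
first_deadline = next_destroy_time(None, ttl, t0)
# A later apply at t0 + 10h moves the deadline to t0 + 34h.
rescheduled = next_destroy_time(None, ttl, t0 + timedelta(hours=10))
```

Because the inactivity deadline is derived from the last state change rather than stored, rescheduling needs no extra bookkeeping in this sketch.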

Proposed design (Terrapod-native, reuses existing machinery)

  • DB (Workspace): auto_destroy_at (TIMESTAMPTZ, nullable — one-shot), auto_destroy_inactivity (interval/seconds, nullable — TTL since last state change), auto_destroy_status. Both null = disabled (default; safe).
  • Trigger: a new distributed-scheduler periodic task (scheduler.py, multi-replica safe, no leader election) — auto_destroy_check — finds workspaces where now >= auto_destroy_at OR now - last_state_change >= auto_destroy_inactivity, and queues an is_destroy + auto_apply run with a dedicated source = "auto-destroy".
  • Completion: reuse the existing #314 (Autodiscovery Workspace Lifecycle Management) pattern. The opt-in-destroy → archive hook in run_service.transition_run is keyed on a literal run.source; add "auto-destroy" so a successful destroy archives (not deletes) the workspace. Inactivity timer resets naturally on the next applied state version.
  • Safe by default (mirrors #314): disabled unless explicitly set; never destroys with unmanaged drift-only changes; dedupe so only one destroy is queued/in-flight per workspace (same guard pattern as #314's reconcile_branch_advance); never acts if a run is already active on the workspace.
  • Notifications: reuse the notification dispatcher — emit reminder (configurable lead time) + result events; add run:auto_destroy_* triggers to the existing trigger set.
  • API/UI: workspace settings fields + the bulk-update endpoint (#318, Feature Request: Batch Workspace Updates and Expanded Workspace Defaults) so a fleet of ephemeral envs can be configured at once; surface a "destroys in …" banner + a list badge.
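As a rough illustration of the trigger condition and the safety guards in the design above, the per-workspace decision inside auto_destroy_check might look like the following. `WorkspaceSnapshot`, `is_due`, `should_queue_destroy`, and the two boolean fields are hypothetical names, not existing Terrapod code; treating a workspace with no state change yet as "not due by inactivity" is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class WorkspaceSnapshot:
    # Hypothetical read-only view of a Workspace row.
    auto_destroy_at: Optional[datetime]           # proposed column
    auto_destroy_inactivity: Optional[timedelta]  # proposed column
    last_state_change: Optional[datetime]
    has_active_run: bool          # any run currently in progress
    destroy_already_queued: bool  # an auto-destroy run is queued or in flight


def is_due(ws: WorkspaceSnapshot, now: datetime) -> bool:
    """now >= auto_destroy_at OR now - last_state_change >= auto_destroy_inactivity."""
    by_schedule = ws.auto_destroy_at is not None and now >= ws.auto_destroy_at
    by_inactivity = (
        ws.auto_destroy_inactivity is not None
        and ws.last_state_change is not None
        and now - ws.last_state_change >= ws.auto_destroy_inactivity
    )
    return by_schedule or by_inactivity


def should_queue_destroy(ws: WorkspaceSnapshot, now: datetime) -> bool:
    """Apply the guards first, then the due check."""
    if ws.has_active_run or ws.destroy_already_queued:
        return False
    return is_due(ws, now)


now = datetime(2024, 1, 2, tzinfo=timezone.utc)
idle = WorkspaceSnapshot(
    auto_destroy_at=None,
    auto_destroy_inactivity=timedelta(hours=12),
    last_state_change=now - timedelta(hours=13),
    has_active_run=False,
    destroy_already_queued=False,
)
queue_it = should_queue_destroy(idle, now)
```

With both columns null, `is_due` is always false, which matches the disabled-by-default rule. This in-memory check alone would not hold across replicas; the dedupe guard would presumably need to be enforced in the same database transaction that queues the run, which this sketch does not show.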

Scope / non-goals

Acceptance criteria

  • Scheduled and inactivity modes each queue exactly one is_destroy auto-apply run at the right time; dedupe holds across poll cycles and replicas.
  • Successful auto-destroy archives the workspace (state/history retained); inactivity timer resets on new applies.
  • Disabled by default; reminder + result notifications fire.
  • Tests (timer logic, dedupe, archive hook, multi-replica) + docs + runbook ("workspace auto-destroyed unexpectedly / recover an archived ephemeral workspace").
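The archive criterion could be exercised with a check along these lines. It is only a sketch: the `Run` shape, the `"applied"` status string, and the `ARCHIVE_ON_DESTROY_SOURCES` set are assumptions for illustration; the real hook lives in run_service.transition_run and its existing source literal is not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical set of run sources whose successful destroy archives the
# workspace; the source already handled by #314 would also belong here.
ARCHIVE_ON_DESTROY_SOURCES = frozenset({"auto-destroy"})


@dataclass
class Run:
    # Hypothetical minimal view of a run record.
    source: str
    is_destroy: bool
    status: str  # "applied" is an assumed name for a successful terminal state


def should_archive_workspace(run: Run) -> bool:
    """Archive only after a successful destroy run from an opted-in source."""
    return (
        run.is_destroy
        and run.status == "applied"
        and run.source in ARCHIVE_ON_DESTROY_SOURCES
    )


auto_run = Run(source="auto-destroy", is_destroy=True, status="applied")
manual_run = Run(source="ui", is_destroy=True, status="applied")
failed_run = Run(source="auto-destroy", is_destroy=True, status="errored")
```

Keying on a set rather than a single literal keeps the hook open to further sources without touching the transition logic, though a single literal comparison would work equally well for one source.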
