Skip to content

Migration intelligence — proposes SQL for spec/entity diffs with safety checks #80

@tonydspaniard

Description

@tonydspaniard

Goal

Build migration intelligence: a tool that takes a spec diff (or an entity-class diff) and proposes the SQL needed to safely evolve the database — column adds, renames, drops, type changes, foreign key adjustments — including data-migration where required. Surfaced via bin/altair db:migration-plan and the framework__plan_migration MCP tool.

Today migrations are emitted as skeletons. The agent has to fill in the data-shape changes by hand. Migration intelligence fills them in correctly, with awareness of safety pitfalls (NOT NULL on existing rows, FK orphaning, etc.).

Why

Spec-driven scaffolding is great for greenfield endpoints. Real systems do refactors:

  • "Rename User.password to User.password_hash"
  • "Split User.full_name into first_name + last_name"
  • "Make email unique (turns out it wasn't before)"
  • "Drop legacy_role, but keep its data in a sidecar table for 90 days"

Each is a non-trivial migration. Without intelligence, the agent writes "ALTER TABLE users ADD COLUMN ..." and hopes. With it, the agent gets a proposal that includes:

  1. The DDL operations in the right order
  2. Backfill SQL for data migrations
  3. Rollback steps
  4. Safety warnings (e.g. "This NOT NULL change will fail on 23 existing rows — backfill required first")

How it works

  1. Compute the diff — either from two spec files (api/users/create.yaml@v1 vs @v2), or from current spec vs. current schema, or from a given Entity class vs. its current DB shape
  2. Walk the diff producing intentsadd_column, rename_column, drop_column, change_type, add_index, add_foreign_key, migrate_data
  3. Plan the SQL — per intent, emit SQL appropriate to the target driver (Postgres / MySQL / SQLite via Cycle's dialect layer)
  4. Run safety checks — query the live DB (read-only) to verify the migration won't fail. E.g. for ALTER COLUMN ... SET NOT NULL, count rows where the column is null; warn if non-zero.
  5. Output the proposal — as a Cycle migration class, ready for review and db:migrate

CLI surface

bin/altair db:migration-plan                                    # detect drift between specs and DB
bin/altair db:migration-plan api/users.yaml                    # for one spec file
bin/altair db:migration-plan --from-entity=App\\User\\User    # entity-vs-schema diff
bin/altair db:migration-plan --output=database/migrations/    # write the proposed migration
bin/altair db:migration-plan --dry-run                          # print, don't write
bin/altair db:migration-plan --format=json                      # for MCP

Output (pretty mode):

Proposed migration: add_display_name_to_users.php

Operations:
  → ADD COLUMN display_name VARCHAR(255) NULL DEFAULT NULL
  → UPDATE users SET display_name = COALESCE(first_name || ' ' || last_name, email) WHERE display_name IS NULL
  → ALTER COLUMN display_name SET NOT NULL
  ✓ Safe: backfill query would affect 247 rows; estimated <1s on production-sized table.

Rollback:
  → DROP COLUMN display_name

Output (JSON mode for MCP):

{
  "migration_name": "add_display_name_to_users",
  "filename": "database/migrations/2026_05_27_114523_add_display_name_to_users.php",
  "operations": [
    { "op": "add_column", "table": "users", "column": "display_name", "type": "varchar(255)", "nullable": true },
    { "op": "data_migration", "sql": "UPDATE users SET ..." },
    { "op": "alter_column", "table": "users", "column": "display_name", "nullable": false }
  ],
  "rollback": [
    { "op": "drop_column", "table": "users", "column": "display_name" }
  ],
  "safety": {
    "issues": [],
    "warnings": [
      { "level": "warn", "message": "Backfill will affect 247 rows" }
    ],
    "estimated_duration_ms": 800
  }
}

Safety checks performed

Change Check Severity
Add NOT NULL without default Count rows where new column would be NULL (post-rename, etc.) — if > 0, propose backfill error
Add UNIQUE Count duplicates — if > 0, propose dedup strategy or refuse error
Drop column Warn unconditionally; require --force if column has non-null values warn
Rename column Recommend two-phase: add new column, copy data, deploy, drop old column info
Add foreign key Count orphan rows in the foreign-key column — if > 0, refuse error
Change column type Sample 100 rows, try the cast in a transaction (rolled back); warn on cast failures warn or error
Drop index Warn if the index is used in queries from pg_stat_user_indexes / equivalent info
Large table operation (>1M rows) Warn that this should run during a maintenance window or with CONCURRENTLY (Postgres) warn

The checks query the dev DB read-only. On staging/prod the agent shouldn't run this directly — instead generate the plan locally and apply via the team's deploy process.

MCP tool

{
  "name": "framework__plan_migration",
  "description": "Compute a migration plan from a spec change or entity diff",
  "inputSchema": {
    "from": "string (spec path or entity class)",
    "to": "string (spec path or entity class, optional — defaults to current state)",
    "safety_check": "boolean (default true)"
  }
}

The agent calls this before committing to a refactor. Sees the proposed operations + safety. Decides whether to proceed.

Two-phase migrations

For renames and incompatible type changes, the tool produces two migrations:

  1. 2026_05_27_114523_phase1_add_new_column_for_rename.php — additive, deploys safely
  2. 2026_05_27_114600_phase2_drop_old_column_after_rename.php — destructive, deploy after phase 1 is verified in prod

The plan output explicitly labels them and the safety section warns that phase 2 should not run in the same release.

Shape

src/Altair/MigrationIntelligence/
├── Cli/
│   └── PlanCommand.php
├── Mcp/
│   └── PlanMigrationTool.php
├── Diff/
│   ├── SpecDiffer.php           # YAML spec vs YAML spec (or vs DB schema)
│   ├── EntityDiffer.php         # PHP entity vs DB schema
│   └── SchemaIntrospector.php   # uses Cycle's schema reader
├── Intent/
│   ├── AddColumnIntent.php
│   ├── DropColumnIntent.php
│   ├── RenameColumnIntent.php
│   ├── ChangeTypeIntent.php
│   ├── AddIndexIntent.php
│   ├── AddForeignKeyIntent.php
│   └── DataMigrationIntent.php
├── Planner/
│   ├── PostgresPlanner.php
│   ├── MySqlPlanner.php
│   └── SqlitePlanner.php
├── Safety/
│   ├── NotNullSafetyCheck.php
│   ├── UniqueSafetyCheck.php
│   ├── ForeignKeySafetyCheck.php
│   ├── TypeCastSafetyCheck.php
│   └── LargeTableSafetyCheck.php
├── Emitter/
│   └── CycleMigrationEmitter.php  # produces the Cycle migration PHP class
└── composer.json

Acceptance criteria

  • db:migration-plan produces a correct Cycle migration for: add column, drop column, rename column, add index, add unique constraint, add foreign key, change type
  • Two-phase migrations emitted for rename + incompatible type changes
  • Safety checks query the dev DB read-only, return accurate row counts
  • Rollback section generated for every intent that supports it
  • Dialect coverage: Postgres + MySQL + SQLite produce correct DDL for the same intent
  • framework__plan_migration MCP tool returns the structured plan
  • Spec-driven path: edit a spec, run db:migration-plan, get a migration that brings DB in line with new spec
  • Entity-driven path: edit an entity class, run with --from-entity, get a migration
  • Tests:
    • Per-intent unit test: intent + dialect → expected SQL
    • Safety checks: fixture DBs with rows that should/shouldn't trigger warnings
    • End-to-end: snapshot test from users.yaml@v1 + users.yaml@v2 → expected migration file content (deterministic)
    • Cross-dialect: same intent produces correct SQL on all three drivers

Out of scope

  • Data migrations beyond simple SQL UPDATE (complex business-logic data transforms are still hand-written)
  • Online schema changes (pt-online-schema-change, gh-ost) integration
  • Multi-tenant DBs / sharding
  • NoSQL migration intelligence (Mongo / DynamoDB) — separate concern
  • Generating tests for the migration (the agent can run the test suite post-migration; we don't need to scaffold migration-specific tests)

Dependencies

No new external deps — leverages Cycle's existing schema builder and migration libraries.

Why this matters

Migrations are the deploy-time disaster vector. An agent that writes a migration without intelligence will eventually generate ALTER COLUMN SET NOT NULL on a column with 50% nulls and bring down production. With intelligence, the same agent gets warned and produces the correct two-phase plan. The single feature that makes "agent-written migrations" production-safe.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions