Merged
2 changes: 2 additions & 0 deletions README.md
@@ -60,6 +60,8 @@ Spirit periodically saves the progress of a schema change to an internal checkpo

Because many migrations are best measured in _days_, this feature can save a lot of otherwise lost work and improve the predictability of large-table schema migrations.

> **⚠️ Resume across Spirit binary versions is not supported.** A migration must be resumed by the same Spirit binary version that wrote the checkpoint. See [checkpoint-max-age](docs/migrate.md#checkpoint-max-age) for details on what Spirit does (and does not) detect when a different version is used to resume.

**Note:** [This feature](https://github.com/github/gh-ost/blob/master/doc/resume.md) is now available in gh-ost.

## Atomic Multi-table changes
17 changes: 17 additions & 0 deletions docs/migrate.md
@@ -78,6 +78,21 @@ The maximum age of a checkpoint before Spirit refuses to resume from it. When Sp

This protects against resuming from very stale checkpoints where replaying the accumulated binary log changes would take longer than starting the migration from scratch.

#### Resuming across Spirit binary versions

> **⚠️ Resuming a migration with a different Spirit binary version than the one that wrote the checkpoint is not supported and may produce incorrect results.**

When Spirit reads a checkpoint, it relies on the columns of the checkpoint table matching the columns the current binary expects:

- **If the checkpoint table schema differs** between versions (columns added, removed, or reordered), the resume read fails, and Spirit logs a warning and starts a fresh migration. This is true in both strict and non-strict modes: [`--strict`](#strict) does not currently hard-fail on a checkpoint-table schema mismatch, so progress from the previous binary version is discarded with only a log warning.
- **If the checkpoint table schema is unchanged but the *meaning* of stored values has changed** between versions (for example, a watermark format change, a routing-policy change, or a new applier behavior), Spirit cannot detect the mismatch. The resume will silently succeed and the new binary will reinterpret the old checkpoint, which can produce incorrect results.
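
The difference between the two cases can be illustrated with a small sketch. It assumes a purely structural comparison of column names; the column list and the `columnsMatch` helper are hypothetical and do not reflect the real checkpoint table or Spirit's read path.

```go
package main

import "fmt"

// expectedColumns is a hypothetical list of columns that the running binary
// expects. The real checkpoint table layout is not reproduced here.
var expectedColumns = []string{"copier_watermark", "checksum_watermark", "binlog_name", "binlog_pos"}

// columnsMatch reports whether actual has exactly the expected columns in the
// same order. Added, removed, or reordered columns all count as a mismatch.
func columnsMatch(expected, actual []string) bool {
	if len(expected) != len(actual) {
		return false
	}
	for i := range expected {
		if expected[i] != actual[i] {
			return false
		}
	}
	return true
}

func main() {
	// Case 1: another version wrote a table with an extra column.
	// A structural check catches this.
	otherShape := []string{"copier_watermark", "checksum_watermark", "binlog_name", "binlog_pos", "extra_col"}
	fmt.Println(columnsMatch(expectedColumns, otherShape)) // false

	// Case 2: the columns are identical, but suppose the encoding of the
	// watermark values changed between versions. A structural check passes,
	// so nothing flags the semantic mismatch.
	sameShape := []string{"copier_watermark", "checksum_watermark", "binlog_name", "binlog_pos"}
	fmt.Println(columnsMatch(expectedColumns, sameShape)) // true
}
```

The second case is the dangerous one: a check that only looks at the shape of the table has no way to tell that the stored values now mean something different.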

Operationally, this means:

- Do not upgrade or downgrade the Spirit binary while a migration is in progress.
- If you must change Spirit versions, let the in-flight migration finish first, or accept the lost progress and start fresh with the new version.
- For long-running migrations that span planned binary upgrades, plan to drain the migration before the upgrade window.

### checksum-yield-timeout

- Type: Duration
@@ -286,6 +301,8 @@ The scenarios where `--strict` causes Spirit to fail rather than restart are:

In all of these cases, the default (non-strict) behavior is to log a warning and start fresh, which is usually the correct action.

Note: a checkpoint-table schema mismatch (typically caused by resuming with a different Spirit binary version; see [Resuming across Spirit binary versions](#resuming-across-spirit-binary-versions)) is **not** one of the strict-mode hard-fail cases. In both strict and non-strict modes, Spirit logs a warning and starts a fresh migration.

### table

- Type: String