
Commit b2c0977

fix(deploy): shrink entrypoint to bare-minimum boot prep

The entrypoint's reconcile state machine moved into the API process (prior commit). What's left is the bits that *must* run before the Node process exists: trust the PVC, set a pseudonymous git identity for rebase committer lines, and ensure a `.git` exists by doing a full-history clone on first boot.

About 190 lines of shell → ~75 (most of it now comments). The reconciliation, conflict-escape-hatch, and fetch-failure handling are all the API's job now.

Also updates docs/operations/deploy.md "Boot sequence" to reflect the split: entrypoint clones, API reconciles.

Refs #66.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 parent 1b88659 commit b2c0977

2 files changed

Lines changed: 88 additions & 164 deletions


deploy/docker/entrypoint.sh

Lines changed: 45 additions & 148 deletions
@@ -1,36 +1,29 @@
 #!/bin/sh
 # CodeForPhilly API entrypoint.
 #
-# On pod start:
-#   1. Ensures a workable clone of CFP_DATA_REMOTE exists at CFP_DATA_REPO_PATH.
-#   2. Reconciles local commits (made by the previous pod's runtime that the
-#      push daemon hadn't yet pushed) with origin:
-#        - in sync → no-op
-#        - behind → fast-forward
-#        - ahead → push pending commits to origin
-#        - diverged + clean rebase → rebase + push
-#        - diverged + conflicts → push a `conflicts/<UTC-timestamp>` branch
-#          to origin for operator review, then hard-reset local to origin so
-#          the pod boots from a known-good state. Never silently drops work.
-#   3. exec the API.
+# Minimal boot prep. The data-repo reconciliation state machine
+# (in-sync / behind / ahead / diverged-clean-rebase / diverged-conflict-escape)
+# lives in the Node API process now — see apps/api/src/store/reconcile.ts
+# and apps/api/src/plugins/reconcile.ts. This script only ensures:
+#
+#   1. The PVC mount at CFP_DATA_REPO_PATH is trusted by git regardless of
+#      file-ownership (PVCs survive pod restarts and may carry files owned
+#      by a different uid than the current runAsUser).
+#   2. A reasonable git user identity is configured for any rebase committer
+#      writes (rebase preserves authors of replayed commits; the committer
+#      line is the only thing that can pick up runtime identity).
+#   3. There IS a valid `.git` working tree at CFP_DATA_REPO_PATH. On first
+#      pod boot (empty PVC), we do an initial full-history clone. On
+#      subsequent boots, the reconciler inside the API decides what to do.
 #
 # Required env:
 #   CFP_DATA_REPO_PATH — local working-tree path (mounted PVC in k8s)
 # Optional env:
 #   CFP_DATA_REMOTE — git URL to clone/fetch/push. If unset, the entrypoint
 #                     assumes an offline-style dev setup and uses whatever
 #                     working tree is already at CFP_DATA_REPO_PATH.
-#   CFP_DATA_BRANCH — branch to track (default: main).
+#   CFP_DATA_BRANCH — branch to clone initially (default: main).
 #   GIT_SSH_COMMAND — set when an SSH deploy key is mounted.
-#
-# Failure modes:
-#   - Fetch failures are non-fatal — log + continue with local state. The
-#     push-daemon retries on its schedule.
-#   - Push failures during reconciliation are non-fatal — the push-daemon
-#     retries once the API starts.
-#   - Rebase conflicts trigger the escape hatch (conflict branch + hard reset).
-#     The API still boots; the operator investigates the named branch.
-#   - Anything else (clone failure, etc.) crashes the container; k8s restarts.

 set -eu

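Reviewer note on item 2 of the new header comment: the identity block further down (unchanged by this commit) relies on the POSIX `: "${VAR:=default}"` idiom, which assigns only when the variable is unset or empty. A minimal sketch of that behavior, wrapped in a function so it can be exercised in isolation; the function name `default_git_identity` is hypothetical and does not appear in the script.

```shell
#!/bin/sh
# Hypothetical wrapper (not in entrypoint.sh) around the identity defaults.
# `: "${VAR:=value}"` assigns value only if VAR is unset or empty, so any
# identity injected via the pod environment wins over the fallback.
default_git_identity() {
  : "${GIT_AUTHOR_NAME:=CodeForPhilly API}"
  : "${GIT_AUTHOR_EMAIL:=api@users.noreply.codeforphilly.org}"
  # The committer falls back to whatever the author resolved to.
  : "${GIT_COMMITTER_NAME:=$GIT_AUTHOR_NAME}"
  : "${GIT_COMMITTER_EMAIL:=$GIT_AUTHOR_EMAIL}"
  export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
}
```

With nothing set, all four variables end up with the pseudonymous defaults; with only `GIT_AUTHOR_NAME` set, the committer name follows it.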
@@ -47,144 +40,48 @@ DATA_BRANCH="${CFP_DATA_BRANCH:-main}"
 # runAsUser (e.g., an earlier iteration ran as root).
 git config --global --add safe.directory "$CFP_DATA_REPO_PATH"

-# Identity for any direct git operations made by the entrypoint (rebase
-# preserves authors of existing commits; this just covers the committer when
-# rebase actually rewrites a commit). API mutations supply their own GIT_AUTHOR_*
-# via gitsheets transaction options.
+# Pseudonymous identity for any direct git operations that pick up the
+# runtime committer line. API mutations supply their own GIT_AUTHOR_* via
+# gitsheets transaction options; the reconciler re-applies these to the
+# repo-local config too, so this is belt-and-suspenders for any other tool
+# that touches the tree.
 : "${GIT_AUTHOR_NAME:=CodeForPhilly API}"
 : "${GIT_AUTHOR_EMAIL:=api@users.noreply.codeforphilly.org}"
 : "${GIT_COMMITTER_NAME:=$GIT_AUTHOR_NAME}"
 : "${GIT_COMMITTER_EMAIL:=$GIT_AUTHOR_EMAIL}"
 export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

-# ---------------------------------------------------------------------------
-# Reconcile against origin. Returns 0 on success or a soft failure; only
-# unrecoverable filesystem/clone errors propagate via `set -e`.
-# ---------------------------------------------------------------------------
-reconcile() {
-  cd "$CFP_DATA_REPO_PATH"
-
-  git config user.name "$GIT_AUTHOR_NAME"
-  git config user.email "$GIT_AUTHOR_EMAIL"
-  git remote set-url origin "$CFP_DATA_REMOTE"
-
-  # Unshallow if a previous clone used --depth=1; the reconciliation logic
-  # below needs the merge-base to be reachable.
-  if [ -f .git/shallow ]; then
-    log "unshallowing existing clone (needed for rebase)"
-    git fetch --unshallow origin "$DATA_BRANCH" 2>&1 | sed 's/^/ /' || \
-      log "WARN: --unshallow failed; continuing with shallow history"
-  fi
-
-  if ! git fetch --prune origin "$DATA_BRANCH" 2>&1 | sed 's/^/ /'; then
-    log "WARN: fetch failed; skipping reconciliation, using local state"
-    return 0
-  fi
-
-  # Ensure we're on the branch.
-  if git rev-parse --verify "refs/heads/$DATA_BRANCH" >/dev/null 2>&1; then
-    git checkout "$DATA_BRANCH" 2>&1 | sed 's/^/ /'
-  else
-    git checkout -b "$DATA_BRANCH" "origin/$DATA_BRANCH" 2>&1 | sed 's/^/ /'
-  fi
-
-  LOCAL=$(git rev-parse HEAD)
-  REMOTE=$(git rev-parse "origin/$DATA_BRANCH")
-  if ! BASE=$(git merge-base HEAD "origin/$DATA_BRANCH" 2>/dev/null); then
-    log "WARN: no merge-base with origin/$DATA_BRANCH; resetting to origin"
-    git reset --hard "origin/$DATA_BRANCH" 2>&1 | sed 's/^/ /'
-    return 0
-  fi
-
-  if [ "$LOCAL" = "$REMOTE" ]; then
-    log "in sync with origin/$DATA_BRANCH"
-    return 0
-  fi
-
-  if [ "$LOCAL" = "$BASE" ]; then
-    log "behind origin/$DATA_BRANCH — fast-forwarding"
-    git merge --ff-only "origin/$DATA_BRANCH" 2>&1 | sed 's/^/ /'
-    return 0
-  fi
-
-  if [ "$REMOTE" = "$BASE" ]; then
-    AHEAD=$(git rev-list --count "origin/$DATA_BRANCH..HEAD")
-    log "ahead of origin/$DATA_BRANCH by ${AHEAD} commit(s) — pushing"
-    if git push origin "$DATA_BRANCH" 2>&1 | sed 's/^/ /'; then
-      log "push succeeded"
-    else
-      log "WARN: push failed; push-daemon will retry once API starts"
-    fi
-    return 0
+if [ ! -d "$CFP_DATA_REPO_PATH/.git" ]; then
+  if [ -z "${CFP_DATA_REMOTE:-}" ]; then
+    log "ERROR: $CFP_DATA_REPO_PATH is not a git repo and CFP_DATA_REMOTE is unset"
+    exit 1
   fi

-  # Diverged: local has commits that origin doesn't AND origin has commits
-  # that local doesn't. Attempt a rebase; if it conflicts, escape-hatch.
-  AHEAD=$(git rev-list --count "origin/$DATA_BRANCH..HEAD")
-  BEHIND=$(git rev-list --count "HEAD..origin/$DATA_BRANCH")
-  log "diverged from origin/$DATA_BRANCH (ahead=${AHEAD}, behind=${BEHIND}) — rebasing"
-
-  if git rebase "origin/$DATA_BRANCH" 2>&1 | sed 's/^/ /'; then
-    log "rebase clean — pushing"
-    if git push origin "$DATA_BRANCH" 2>&1 | sed 's/^/ /'; then
-      log "push succeeded"
-    else
-      log "WARN: push failed; push-daemon will retry once API starts"
-    fi
-    return 0
-  fi
+  mkdir -p "$CFP_DATA_REPO_PATH"

-  # Conflict — escape hatch.
-  CONFLICT_BRANCH="conflicts/$(date -u +%Y-%m-%dT%H-%M-%SZ)"
-  log "ERROR: rebase conflict on $DATA_BRANCH — invoking escape hatch"
-  git rebase --abort 2>&1 | sed 's/^/ /' || true
-  log "preserving pre-rebase HEAD ($LOCAL) at $CONFLICT_BRANCH"
-  git branch "$CONFLICT_BRANCH" "$LOCAL"
-  if git push origin "$CONFLICT_BRANCH" 2>&1 | sed 's/^/ /'; then
-    log "pushed $CONFLICT_BRANCH to origin — operator must investigate"
-  else
-    log "WARN: failed to push $CONFLICT_BRANCH; diverged commits preserved only in this PVC's reflog"
+  # PVC may carry residue from a previous pod that bailed mid-clone.
+  # `git clone` refuses to clone into a non-empty directory, so wipe it
+  # first. Safe because the data repo is always re-cloneable.
+  if [ -n "$(ls -A "$CFP_DATA_REPO_PATH" 2>/dev/null)" ]; then
+    log "$CFP_DATA_REPO_PATH non-empty but lacks .git — wiping before clone"
+    find "$CFP_DATA_REPO_PATH" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
   fi
-  log "resetting $DATA_BRANCH to origin/$DATA_BRANCH"
-  git reset --hard "origin/$DATA_BRANCH" 2>&1 | sed 's/^/ /'
-  return 0
-}

-if [ -z "${CFP_DATA_REMOTE:-}" ]; then
-  if [ -d "$CFP_DATA_REPO_PATH/.git" ]; then
-    log "CFP_DATA_REMOTE unset; using existing working tree at $CFP_DATA_REPO_PATH"
-    cd "$CFP_DATA_REPO_PATH"
-    git config user.name "$GIT_AUTHOR_NAME"
-    git config user.email "$GIT_AUTHOR_EMAIL"
-    cd - >/dev/null
-  else
-    log "ERROR: CFP_DATA_REMOTE is unset and $CFP_DATA_REPO_PATH is not a git repo"
-    exit 1
-  fi
-else
-  mkdir -p "$CFP_DATA_REPO_PATH"
+  log "cloning $CFP_DATA_REMOTE into $CFP_DATA_REPO_PATH (branch=$DATA_BRANCH)"
+  # Full history (no --depth) so the API-side reconciler can rebase against
+  # any realistic divergence on subsequent boots.
+  git clone --branch "$DATA_BRANCH" "$CFP_DATA_REMOTE" "$CFP_DATA_REPO_PATH"
+fi

-  if [ -d "$CFP_DATA_REPO_PATH/.git" ]; then
-    log "reconciling existing data repo at $CFP_DATA_REPO_PATH (branch=$DATA_BRANCH)"
-    reconcile
-    cd - >/dev/null || true
-  else
-    # PVC may carry residue from a previous pod that bailed mid-clone.
-    # `git clone` refuses to clone into a non-empty directory, so wipe it
-    # first. Safe because the data repo is always re-cloneable.
-    if [ -n "$(ls -A "$CFP_DATA_REPO_PATH" 2>/dev/null)" ]; then
-      log "$CFP_DATA_REPO_PATH non-empty but lacks .git — wiping before clone"
-      find "$CFP_DATA_REPO_PATH" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
-    fi
-    log "cloning $CFP_DATA_REMOTE into $CFP_DATA_REPO_PATH (branch=$DATA_BRANCH)"
-    # Full history (no --depth) so subsequent reconciliations can rebase.
-    git clone --branch "$DATA_BRANCH" "$CFP_DATA_REMOTE" "$CFP_DATA_REPO_PATH"
-    cd "$CFP_DATA_REPO_PATH"
-    git config user.name "$GIT_AUTHOR_NAME"
-    git config user.email "$GIT_AUTHOR_EMAIL"
-    cd - >/dev/null
-  fi
+cd "$CFP_DATA_REPO_PATH"
+git config user.name "$GIT_AUTHOR_NAME"
+git config user.email "$GIT_AUTHOR_EMAIL"
+# Ensure the origin URL matches the current env (in case CFP_DATA_REMOTE
+# was rotated). Idempotent.
+if [ -n "${CFP_DATA_REMOTE:-}" ] && git remote get-url origin >/dev/null 2>&1; then
+  git remote set-url origin "$CFP_DATA_REMOTE"
 fi
+cd - >/dev/null

-log "data repo ready; starting API"
+log "data repo ready; starting API (reconciliation runs inside the API process)"
 exec "$@"
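
Reviewer note on the pre-clone guard: the new top-level `if [ ! -d "$CFP_DATA_REPO_PATH/.git" ]` branch wipes any residue before cloning, because `git clone` refuses a non-empty target directory. The sketch below isolates that guard so it can be tried against a scratch directory without a remote. The function name `prepare_clone_target` and its `keep`/`clone` output are hypothetical, introduced only for illustration, and the real script goes on to run `git clone` where this sketch prints `clone`.

```shell
#!/bin/sh
# Hypothetical helper (not in entrypoint.sh) mirroring the pre-clone guard.
# Prints "keep" when a .git directory already exists, otherwise empties the
# target directory and prints "clone" to signal that a clone should follow.
prepare_clone_target() {
  dir=$1
  if [ -d "$dir/.git" ]; then
    echo "keep"
    return 0
  fi
  mkdir -p "$dir"
  # Residue from an interrupted clone would make `git clone` refuse the
  # directory, so remove every top-level entry (dotfiles included).
  if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    find "$dir" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
  fi
  echo "clone"
}
```

Note that `-mindepth` and `-maxdepth` are extensions beyond POSIX `find` (GNU, BSD, and BusyBox provide them), which matters if the base image ever changes.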

docs/operations/deploy.md

Lines changed: 43 additions & 16 deletions
@@ -101,26 +101,52 @@ curl http://localhost:3001/ # SPA index.html

 ## Boot sequence

-The container entrypoint (`deploy/docker/entrypoint.sh`) reconciles the
-data-repo working tree with origin before exec'ing the API. See the
-"Smart entrypoint reconciliation" commit message in `git log
-deploy/docker/entrypoint.sh` for the full state machine; in short:
-
-- in sync → no-op
-- behind → fast-forward
-- ahead → push (push daemon retries on failure)
-- diverged + clean rebase → rebase + push
-- diverged + conflicts → push a `conflicts/<UTC-timestamp>` branch to origin
-  and hard-reset local to origin
+The container entrypoint (`deploy/docker/entrypoint.sh`) only handles the
+bits that *must* run before the Node process exists:
+
+- Trusts the PVC mount via `git config --global safe.directory`.
+- Sets a pseudonymous git identity (`CodeForPhilly API
+  <api@users.noreply.codeforphilly.org>`) for any committer line a future
+  rebase might write.
+- On first pod boot — and only then — does a full-history `git clone` of
+  `CFP_DATA_REMOTE` into `CFP_DATA_REPO_PATH` when no `.git` directory
+  exists. On subsequent boots the PVC already holds a clone; no clone is
+  performed.
+- Refreshes `origin`'s URL to whatever `CFP_DATA_REMOTE` is set to (lets
+  operators rotate the remote without re-cloning the PVC).
+- `exec`s the API. That's all — about a dozen lines of shell now.

 Then `exec node apps/api/dist/index.js`. Inside node, `buildApp()` registers
 plugins ([apps/api/src/app.ts](../../apps/api/src/app.ts)) in order: env →
 CORS → cookies → trace IDs → error mapper → **store** (loads public +
-private into memory) → **push daemon** (starts pushing transact'd commits to
+private into memory) → **reconcile** (fetch + ff/rebase/escape-hatch against
+origin — see below) → **push daemon** (starts pushing transact'd commits to
 `CFP_DATA_REMOTE`) → services (FTS) → rate limit → idempotency → session
 middleware → swagger → routes → static SPA. Fastify's `listen()` doesn't
 fire until all of those resolve, so once `/api/health/ready` returns 200
-both stores have loaded.
+both stores have loaded **and** the working tree has been reconciled with
+origin.
+
+### Reconciliation state machine
+
+Lives in [`apps/api/src/store/reconcile.ts`](../../apps/api/src/store/reconcile.ts)
+and is invoked at boot by the reconcile plugin. Same state machine the
+shell used to run, just structured Node so exit codes propagate naturally
+and the same code is reusable from the future hot-reload webhook (#65):
+
+- in sync → no-op (`'in-sync'`)
+- behind → fast-forward (`'fast-forwarded'`)
+- ahead → push (`'pushed-ahead'`; push daemon retries on push failure)
+- diverged + clean rebase → rebase + push (`'rebased'`)
+- diverged + conflicts → abort rebase, create + push a
+  `conflicts/<UTC-timestamp>` branch from the pre-rebase HEAD, hard-reset
+  local to origin (`'conflict-escaped'`; logged at ERROR level so operators
+  see it in production logs)
+- fetch itself fails (network blip) → log warn, continue with local state
+  (`'fetch-failed'`)
+
+When `CFP_DATA_REMOTE` is unset (typical local dev), the reconcile plugin
+skips reconciliation entirely.

 ## Probes

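Reviewer note on the state machine: the shell `reconcile()` removed by this commit chose between in sync, behind, ahead, and diverged by comparing three commit IDs (local `HEAD`, `origin/<branch>`, and their merge-base). The sketch below restates that comparison as a pure function. It is an illustration only: `classify_state` and the strings it prints are hypothetical, `apps/api/src/store/reconcile.ts` is not part of this diff and may be structured differently, and the fetch-failure and no-merge-base cases are left out.

```shell
#!/bin/sh
# Hypothetical helper: classify the working tree against origin from three
# commit IDs. The caller is assumed to obtain them roughly as:
#   local_sha  = git rev-parse HEAD
#   remote_sha = git rev-parse "origin/$branch"
#   base_sha   = git merge-base HEAD "origin/$branch"
classify_state() {
  local_sha=$1
  remote_sha=$2
  base_sha=$3

  if [ "$local_sha" = "$remote_sha" ]; then
    echo "in-sync"     # nothing to do
  elif [ "$local_sha" = "$base_sha" ]; then
    echo "behind"      # origin has commits we lack: fast-forward
  elif [ "$remote_sha" = "$base_sha" ]; then
    echo "ahead"       # we hold unpushed commits: push
  else
    echo "diverged"    # both sides moved: rebase, escape hatch on conflict
  fi
}
```

For example, `classify_state "$(git rev-parse HEAD)" "$(git rev-parse origin/main)" "$(git merge-base HEAD origin/main)"` would print one of the four states after a successful fetch.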
@@ -133,9 +159,10 @@ both stores have loaded.
 ## Data repo on disk

 The API operates on a working tree at `/app/data` backed by a PVC. The
-entrypoint reconciles that tree with `CFP_DATA_REMOTE` on every boot; the
-push daemon pushes commits made during the pod's lifetime back to the
-remote.
+entrypoint ensures the working tree exists (cloning on first boot); the
+API-side reconcile plugin then synchronizes that tree with `CFP_DATA_REMOTE`
+on every boot, and the push daemon pushes commits made during the pod's
+lifetime back to the remote.

 Implications:
