Engineering Update: PostgreSQL Constraint Coverage + Pipeline Fixes (March 23)

## Engineering Update - Last 18 Hours

### 1. The Core Problem

After the hardening plan was approved and executed, the pipeline hit a cascade of failures that all traced back to one systemic issue: the SQLite-to-PostgreSQL migration transferred all the data but created zero UNIQUE constraints on any table. In SQLite, `ON CONFLICT` works loosely. In PostgreSQL, every `ON CONFLICT(col1, col2)` clause requires an actual UNIQUE index on those exact columns or the INSERT silently fails. This meant every single upsert in the entire pipeline was broken - games were not being inserted, stats were not being inserted, NHL data was not being inserted. The pipeline would "complete" with zero errors but store zero data.

---

### 2. What Was Fixed (Chronological)

#### Round 1: Initial PostgreSQL Compatibility (`e57edb5`)
- **`gtd/storage/db.py`**: Added automatic `ROUND((expr)::numeric, n)` casting in the SQL converter. PostgreSQL does not allow `ROUND(double precision, integer)` - it requires a `numeric` type. This was failing the `player_tiers` step.
- **`gtd/analysis/anomaly_detector.py`**: Added `entity_name` to a `GROUP BY` clause. PostgreSQL is stricter than SQLite about requiring all non-aggregated SELECT columns in GROUP BY.
- **`gtd/storage/database.py`**: Expanded `validate_schema()` from 3 constraint specs to 25, covering tables used by the analysis pipeline (clutch, referee_history, b2b, rest_advantage, etc.).

#### Round 2: Resilient DDL (`f323cdf`)
- **`gtd/storage/database.py`**: Added a `_safe_ddl()` helper function. In PostgreSQL, if a single DDL statement fails inside a transaction, the entire transaction aborts and all subsequent statements fail with "current transaction is aborted." The old code had `ALTER TABLE injuries ADD COLUMN sport ...` which would fail because the column already existed, and this cascaded to abort `validate_schema()` entirely. Now every DDL statement runs in its own mini-transaction with automatic rollback on failure.
- Changed all `ALTER TABLE ADD COLUMN` statements to use `IF NOT EXISTS`.

#### Round 3: player_baselines constraint (`5e14f24`)
- Corrected the `validate_schema()` entry for `player_baselines` from `UNIQUE(player_name)` to `UNIQUE(player_id)`. The anomalies step uses `ON CONFLICT(player_id)`, not `player_name`. Manually deduplicated the table on the server before adding the correct constraint.

#### Round 4: Event monitor and card counts (`f7c49ba`)
- **`gtd/realtime/event_monitor.py`**: Fixed a SQL query that selected `home_team` alongside `SUM()` and `COUNT(*)` without including `home_team` in the GROUP BY. SQLite allows this; PostgreSQL does not. This was causing `[RT] Monitor error` every 15 minutes.
- **`server.py`**: The `_check_per_sport_card_counts()` function queried `WHERE sport = 'NBA'` on the `intelligence_cards` table, but that table has no `sport` column. Replaced with a simple total card count check.

#### Round 5: Full Constraint Coverage + Step Ordering (`186047b`)

This is where the real investigation happened. After all the previous fixes, the pipeline was still producing 0 games. Deep log analysis and code tracing revealed four root causes:

**(a) Deployed cli.py was massively outdated.** The server had a 1,608-line version of `gtd/cli.py`. The current version is 3,915 lines. The old version was missing `STEP_MAP` entries for: `nhl_stats_backfill`, `nhl_skater_baselines`, `nhl_goalie_baselines`, `nhl_power_play`, `nhl_penalty_kill`, `detect_cards`, `career_totals`, all `nfl_*` sub-steps, all `mlb_*` sub-steps. Every missing step was logged as "Unknown step" and silently skipped. The full `gtd/` directory was resynced to the server.

**(b) Core tables had zero UNIQUE constraints.** Comprehensive audit scanning every Python file for `ON CONFLICT(...)` patterns and cross-referencing with `pg_constraint` found **76 missing UNIQUE constraints** across all tables. The critical ones:
- `games` needed `UNIQUE(game_date, home_team, away_team)` - without this, zero NBA games could be inserted
- `player_stats` needed `UNIQUE(game_id, player_name)`
- `nhl_games` needed `UNIQUE(game_id)` - had duplicate rows that had to be cleaned first
- `nhl_skater_stats` needed `UNIQUE(game_id, player_id)`
- `nhl_teams` needed `UNIQUE(abbreviation)`
- `nhl_players` needed `UNIQUE(player_id)`
- `odds` needed `UNIQUE(game_date, home_team, away_team, source)`
- `statements` needed `UNIQUE(source_handle, content)`
- Plus 60+ more across NFL, MLB, NHL analysis, and feature tables

All 76 were batch-added. 63 succeeded immediately, 4 needed deduplication first (done manually), 9 referenced columns that don't exist in those tables (non-critical edge tables). `validate_schema()` was expanded from 25 to 47 constraint specs so these are auto-checked and auto-created on every server startup.

**(c) Step ordering was wrong.** `CORE_DATA_STEPS` in `server.py` started with `nba_players, nfl_games, nfl_schedule, nfl_standings, nfl_odds, nfl_players, nfl_stats, nfl_team_analytics, nfl_situational, nfl_injuries, nfl_baselines, mlb_games, mlb_batting, mlb_pitching, mlb_player_baselines` - 15 slow API-calling steps before the `games` step. A full pipeline took 45+ minutes before it even attempted to fetch NBA games. Reordered to: `games, stats, player_splits, injuries, odds, props` first, then NHL, then analysis, then MLB/NFL last.

**(d) Two SQL bugs.** Event monitor GROUP BY and card counts sport column (described in Round 4).

---

### 3. Current State

| Metric | Before | After |
|---|---|---|
| Intelligence cards | 84 | 146 |
| Games table latest date | March 21 | March 23 |
| NHL sub-steps running | 0 of 6 | All 6 |
| Pipeline step coverage | ~60% (old cli.py) | 100% (full cli.py) |
| UNIQUE constraints | ~25 | 110+ |
| Pipeline step order | NFL/MLB first (45 min to NBA) | NBA first (~2 min to games) |

**March 23 pipeline output:**
- 12 NBA games stored
- 10 odds records, 327 player props
- 1 NHL game, 32 standings, 33 teams, 816 players synced
- 77 goalie baselines updated
- 46 cards detected

---

### 4. What Jarvis Should Review

1. **Constraint completeness**: Run the audit script again to verify no new `ON CONFLICT` patterns have been introduced without matching constraints.
2. **Pipeline output**: Trigger a manual run (`curl http://cypher.178.156.223.137.nip.io/api/v1/trigger`) and watch `/var/log/cypher.log` for any remaining "Unknown step", "ON CONFLICT", or "InFailedSqlTransaction" errors.
3. **Card quality**: Check the 146 cards in the database - are they passing validation, relevant, and accurate?
4. **NHL skater stats freshness**: `nhl_skater_stats` latest date is March 18 (5 days old). The `nhl_stats_backfill` step scans 30 games but stored 0 skater rows. Needs investigation.
5. **NFL games staleness**: Shows Feb 8 as latest. Expected (offseason), but confirm whether `CORE_BLOCKING_TABLES` should include `nfl_games` during offseason - it flags the system as "degraded."
6. **Health endpoint**: Currently returns "degraded" because of stale NFL/MLB offseason tables. Consider whether the staleness threshold should be configurable per sport/season.

---

### 5. Files Changed

| File | What Changed |
|---|---|
| `gtd/storage/database.py` | `validate_schema()` expanded to 47 specs; `_safe_ddl()` helper; `IF NOT EXISTS` for ALTER TABLE |
| `gtd/storage/db.py` | `ROUND()` auto-cast to `::numeric`; `INSERT OR REPLACE` to `ON CONFLICT DO UPDATE SET` fix |
| `gtd/analysis/anomaly_detector.py` | `entity_name` added to GROUP BY |
| `gtd/realtime/event_monitor.py` | Removed non-aggregated `home_team` from SELECT with aggregates |
| `server.py` | `CORE_DATA_STEPS` reordered; card count query fixed |
| `gtd/cli.py` | Already had correct code locally, was just not deployed to server |

### 6. Commits

```
186047b fix: complete PostgreSQL constraint coverage and step ordering
f7c49ba fix: event_monitor GROUP BY and card counts missing sport column
5e14f24 fix: correct player_baselines constraint to use player_id
f323cdf fix: make ensure_all_tables resilient to PostgreSQL transaction aborts
e57edb5 fix: PostgreSQL pipeline compatibility - ROUND cast, GROUP BY, missing constraints
```

File	What Changed
`gtd/storage/database.py`	`validate_schema()` expanded to 47 specs; `_safe_ddl()` helper; `IF NOT EXISTS` for ALTER TABLE
`gtd/storage/db.py`	`ROUND()` auto-cast to `::numeric`; `INSERT OR REPLACE` to `ON CONFLICT DO UPDATE SET` fix
`gtd/analysis/anomaly_detector.py`	`entity_name` added to GROUP BY
`gtd/realtime/event_monitor.py`	Removed non-aggregated `home_team` from SELECT with aggregates
`server.py`	`CORE_DATA_STEPS` reordered; card count query fixed
`gtd/cli.py`	Already had correct code locally, was just not deployed to server

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Engineering Update: PostgreSQL Constraint Coverage + Pipeline Fixes (March 23) #156

Engineering Update - Last 18 Hours

1. The Core Problem

2. What Was Fixed (Chronological)

Round 1: Initial PostgreSQL Compatibility (`e57edb5`)

Round 2: Resilient DDL (`f323cdf`)

Round 3: player_baselines constraint (`5e14f24`)

Round 4: Event monitor and card counts (`f7c49ba`)

Round 5: Full Constraint Coverage + Step Ordering (`186047b`)

3. Current State

4. What Jarvis Should Review

5. Files Changed

6. Commits

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Metric	Before	After
Intelligence cards	84	146
Games table latest date	March 21	March 23
NHL sub-steps running	0 of 6	All 6
Pipeline step coverage	~60% (old cli.py)	100% (full cli.py)
UNIQUE constraints	~25	110+
Pipeline step order	NFL/MLB first (45 min to NBA)	NBA first (~2 min to games)

Engineering Update: PostgreSQL Constraint Coverage + Pipeline Fixes (March 23) #156

Description

Engineering Update - Last 18 Hours

1. The Core Problem

2. What Was Fixed (Chronological)

Round 1: Initial PostgreSQL Compatibility (e57edb5)

Round 2: Resilient DDL (f323cdf)

Round 3: player_baselines constraint (5e14f24)

Round 4: Event monitor and card counts (f7c49ba)

Round 5: Full Constraint Coverage + Step Ordering (186047b)

3. Current State

4. What Jarvis Should Review

5. Files Changed

6. Commits

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Round 1: Initial PostgreSQL Compatibility (`e57edb5`)

Round 2: Resilient DDL (`f323cdf`)

Round 3: player_baselines constraint (`5e14f24`)

Round 4: Event monitor and card counts (`f7c49ba`)

Round 5: Full Constraint Coverage + Step Ordering (`186047b`)