Objective
The pipeline cannot complete a clean run. Every execution fails on the same type mismatch errors. Until this is fixed, we cannot generate cards reliably.
Current Failure
reason_code=step_exception step=mlb_batting attempt=2/2
detail=operator does not exist: text = integer
LINE 1: ...away_team, game_state FROM mlb_games WHERE season = 2026 AND...
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
reason_code=step_exception step=mlb_pitching attempt=2/2
detail=operator does not exist: text = integer
(same issue)
reason_code=step_exception step=nfl_team_analytics attempt=2/2
detail=unsupported operand type(s) for -: 'float' and 'decimal.Decimal'
Root Cause Analysis
MLB Issue
The season column in mlb_games is stored as TEXT, but queries compare it to INTEGER 2026.
PostgreSQL is strict about type comparisons. SQLite was lenient.
Find all occurrences:
grep -rn "season = 2026" --include="*.py" .
grep -rn "season=2026" --include="*.py" .
grep -rn "WHERE season" --include="*.py" .
NFL Issue
Python float cannot do arithmetic with PostgreSQL decimal.Decimal returned from queries.
Find all occurrences:
grep -rn "decimal\|Decimal" --include="*.py" .
# Look for arithmetic operations on query results
Requirements
1. Fix MLB Type Mismatch
Option A (preferred): Fix the queries to use string comparison:
# Before
WHERE season = 2026
# After
WHERE season = '2026'
Option B: Cast in the query:
WHERE season::integer = 2026
Option C: Fix the column type (requires migration):
ALTER TABLE mlb_games ALTER COLUMN season TYPE integer USING season::integer;
Recommend Option A - fastest, no migration needed.
2. Fix NFL Decimal Issue
When fetching numeric data from PostgreSQL, convert Decimal to float before arithmetic:
# Before
result = row[0] - row[1] # Fails if one is Decimal, one is float
# After
result = float(row[0]) - float(row[1])
Or configure psycopg2 to return floats:
import psycopg2.extensions
psycopg2.extensions.register_type(psycopg2.extensions.new_type(
psycopg2.extensions.DECIMAL.values,
'DEC2FLOAT',
lambda value, curs: float(value) if value is not None else None
))
3. Audit All Queries for Type Safety
Search the entire codebase for potential type mismatches:
# Find all raw SQL queries
grep -rn "SELECT\|INSERT\|UPDATE\|WHERE" --include="*.py" . | grep -v test
# For each query that compares to a Python variable, verify types match
Create a checklist of every table/column that stores numbers as TEXT and ensure all queries handle this.
4. Add Type Safety Tests
Create a test that runs each pipeline step individually and catches type errors:
def test_pipeline_steps_type_safe():
for step_name, step_fn in STEP_MAP.items():
try:
# Run with minimal data to catch type errors
step_fn(dry_run=True)
except TypeError as e:
pytest.fail(f"Step {step_name} has type error: {e}")
Validation
After fixes are deployed:
- Trigger a full pipeline run:
curl http://cypher.178.156.223.137.nip.io/api/v1/trigger
- Monitor logs for 30 minutes:
ssh root@178.156.223.137 "tail -f /var/log/cypher.log | grep -E 'step_exception|TypeError|operator does not exist'"
- Verify pipeline completes:
curl http://cypher.178.156.223.137.nip.io/health | jq ".pipeline.last_run"
Expected: "status": "completed", "errors": null
Definition of Done
Objective
The pipeline cannot complete a clean run. Every execution fails on the same type mismatch errors. Until this is fixed, we cannot generate cards reliably.
Current Failure
Root Cause Analysis
MLB Issue
The
seasoncolumn inmlb_gamesis stored as TEXT, but queries compare it to INTEGER2026.PostgreSQL is strict about type comparisons. SQLite was lenient.
Find all occurrences:
NFL Issue
Python
floatcannot do arithmetic with PostgreSQLdecimal.Decimalreturned from queries.Find all occurrences:
Requirements
1. Fix MLB Type Mismatch
Option A (preferred): Fix the queries to use string comparison:
Option B: Cast in the query:
Option C: Fix the column type (requires migration):
Recommend Option A - fastest, no migration needed.
2. Fix NFL Decimal Issue
When fetching numeric data from PostgreSQL, convert Decimal to float before arithmetic:
Or configure psycopg2 to return floats:
3. Audit All Queries for Type Safety
Search the entire codebase for potential type mismatches:
Create a checklist of every table/column that stores numbers as TEXT and ensure all queries handle this.
4. Add Type Safety Tests
Create a test that runs each pipeline step individually and catches type errors:
Validation
After fixes are deployed:
ssh root@178.156.223.137 "tail -f /var/log/cypher.log | grep -E 'step_exception|TypeError|operator does not exist'"Expected:
"status": "completed","errors": nullDefinition of Done
mlb_battingstep completes without type errorsmlb_pitchingstep completes without type errorsnfl_team_analyticsstep completes without type errorsstatus: completed