P0: Card Quality Audit Findings - Verification Not Running, Text Still Broken #166

@kevsilk597

Summary

Reviewed the March 24 deploy. Pipeline works. Volume is up. But verification is not actually verifying claims, and text quality issues persist. This issue documents all findings and required fixes.


Issue 1: Stat Verification Is Not Running (P0)

Problem

Every card shows:

"verification": {
    "score": 100,
    "checks_total": 4,
    "claims_total": 0,
    "claims_verified": 0,
    "claims_failed": 0,
    "claim_details": []
}

The claim extraction function is returning zero claims. Cards make verifiable statements like:

  • "HOU is 11-2 post-loss"
  • "RJ Barrett at 31.6 minutes, 13% over his 28.0 season average"
  • "Favorites looking ahead lose 4+ points ATS in trap setups"

But claims_total: 0 means none of these were extracted or verified. The verification score of 100 is meaningless.
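One way to keep a zero-claim card from reporting a perfect score is to treat "nothing was checked" as its own state. The sketch below is an assumption about how the summary could be built, not the current stat_verifier.py code: summarize_verification and the per-claim dict shape are hypothetical names.

```python
from typing import Optional


def summarize_verification(claim_results: list[dict]) -> dict:
    """Build the verification summary from per-claim results.

    Each result is assumed to look like {"claim": str, "verified": bool}.
    """
    total = len(claim_results)
    verified = sum(1 for r in claim_results if r["verified"])
    # With zero claims there is nothing to score, so report None instead of 100.
    score: Optional[int] = round(100 * verified / total) if total else None
    return {
        "score": score,
        "claims_total": total,
        "claims_verified": verified,
        "claims_failed": total - verified,
        "claim_details": claim_results,
    }
```

A None score lets dashboards and quality gates tell "unverified" apart from "fully verified".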

Root Cause

One of the following is happening with extract_stat_claims() in stat_verifier.py:

  1. The function is not being called
  2. Its regex patterns do not match real card text
  3. It returns an empty list because of a bug

Required Fix

  1. Add logging to extract_stat_claims():
import logging

logger = logging.getLogger(__name__)  # skip if stat_verifier.py already defines a logger

def extract_stat_claims(card: dict) -> list[dict]:
    logger.info(f"[VERIFIER] Extracting claims from card {card.get('id')}")
    logger.info(f"[VERIFIER] Card text: {card.get('context', '')[:200]}")

    claims = []
    # ... extraction logic ...

    logger.info(f"[VERIFIER] Found {len(claims)} claims: {claims}")
    return claims
  2. Test the regex patterns against actual card text:
# Example card text:
text = "HOU doesn't lose twice — **11-2 record** post-loss is elite bounce-back DNA."

# This should extract:
# - stat_type: "team_record"
# - entity: "HOU"
# - value: "11-2"
# - condition: "post-loss"
  3. Verify that claims are passed to verify_claim() and that the results are stored
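The regex check in step 2 can be run in isolation before touching the pipeline. The sketch below is a minimal, hypothetical pattern for the team-record case only. TEAM_RECORD_RE and extract_team_record_claims are illustrative names, not the existing code in stat_verifier.py, and the pattern assumes a 2-3 letter team abbreviation followed by a win-loss record and the literal condition "post-loss".

```python
import re

# Hypothetical pattern: team abbreviation, then a win-loss record, then a condition.
TEAM_RECORD_RE = re.compile(
    r"\b(?P<entity>[A-Z]{2,3})\b"        # team abbreviation, e.g. HOU
    r".*?(?P<value>\d{1,2}-\d{1,2})\b"   # win-loss record, e.g. 11-2
    r".*?(?P<condition>post-loss)"       # only the post-loss condition is handled
)


def extract_team_record_claims(text: str) -> list[dict]:
    """Return one claim dict per team-record statement found in text."""
    return [
        {"stat_type": "team_record", **m.groupdict()}
        for m in TEAM_RECORD_RE.finditer(text)
    ]


if __name__ == "__main__":
    print(extract_team_record_claims("HOU is 11-2 post-loss"))
```

Because the gaps between the groups are matched lazily, markdown emphasis around the record (such as `**11-2 record**`) does not prevent a match. Player stats and trend claims would need their own patterns.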

Expected Outcome

After fix:

"verification": {
    "score": 100,
    "claims_total": 3,
    "claims_verified": 3,
    "claims_failed": 0,
    "claim_details": [
        {
            "claim": "HOU 11-2 post-loss",
            "verified": true,
            "query": "SELECT wins, losses FROM team_post_loss_record WHERE team = 'HOU'",
            "db_value": "11-2"
        }
    ]
}

Issue 2: Verify-Card Endpoint Broken (P1)

Problem

GET /api/v1/admin/verify-card?id=trap_game-1690

Response: {"error": "invalid literal for int() with base 10: 'trap_game-1690'"}

The endpoint tries to parse the card ID as an integer, but card IDs are strings like "trap_game-1690".

Required Fix

In server.py, find the verify-card handler:

# BEFORE
card_id = int(request.args.get('id'))

# AFTER  
card_id = request.args.get('id')  # Keep as string

Or if the database uses integer IDs internally:

card_id_str = request.args.get('id')  # e.g., "trap_game-1690"
# Query by the string ID field, not integer primary key
card = db.query("SELECT * FROM intelligence_cards WHERE card_id = %s", [card_id_str])
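Keeping the ID as a string removes the crash, but the handler may still want to reject obviously malformed input before querying. The sketch below is one possible validation step. The `<card_type>-<number>` shape is an assumption inferred from the single example "trap_game-1690", and normalize_card_id is a hypothetical helper, not something that exists in server.py.

```python
import re
from typing import Optional

# Assumed ID shape: lowercase card type, a hyphen, then digits (e.g. "trap_game-1690").
CARD_ID_RE = re.compile(r"[a-z_]+-\d+")


def normalize_card_id(raw: Optional[str]) -> str:
    """Return the trimmed card ID, or raise ValueError if it is missing or malformed."""
    if raw is None or not raw.strip():
        raise ValueError("missing card id")
    card_id = raw.strip()
    if not CARD_ID_RE.fullmatch(card_id):
        raise ValueError(f"unexpected card id format: {card_id!r}")
    return card_id
```

The handler could call this on `request.args.get('id')` and turn a ValueError into a 400 response. If other ID shapes exist in the database, the pattern needs to be widened accordingly.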

Expected Outcome

GET /api/v1/admin/verify-card?id=trap_game-1690

Response:
{
    "card_id": "trap_game-1690",
    "claims_found": 3,
    "claims_verified": 3,
    "verification_details": [...]
}

Issue 3: Text Still Truncated/Broken (P0)

Problem

Multiple cards have incomplete sentences:

Kel'el Ware card:

"Heat don't have He's hitting that workload"
"Legs won't be Over his season average"

RJ Barrett card:

"That's 3.6 extra possessions per night — adds Back-to-backs wreck role players"

Old template filler still present:

"Key signal: This stat has moved consistently over the last three games and sits outside the."

Root Cause

  1. Claude max_tokens may still be too low (even after bump to 800)
  2. Post-processing is not catching truncated sentences
  3. Banned phrase list is not removing "Key signal: This stat has moved..."

Required Fix

  1. Increase max_tokens to 1200 for card generation

  2. Add truncation detection before storage:

import re

# Words that should never be the last word of a complete sentence (heuristic)
DANGLING_WORDS = ('the', 'a', 'an', 'is', 'are', 'was', 'has', 'have')

def is_truncated(text: str) -> bool:
    """Detect if text was cut off mid-sentence."""
    text = text.strip()
    if not text:
        return False
    # The text must end with terminal punctuation, optionally followed by quotes or markdown
    if not re.search(r'[.!?]["\'*)]*$', text):
        return True
    # Split only where punctuation is followed by whitespace, so decimals like "31.6" stay intact
    for sentence in re.split(r'(?<=[.!?])\s+', text):
        words = re.findall(r"[A-Za-z']+", sentence)
        # Check for dangling words, e.g. "...and sits outside the."
        if words and words[-1].lower() in DANGLING_WORDS:
            return True
    return False
  3. Add to banned phrases:
BANNED_PHRASES.extend([
    "Key signal: This stat has moved consistently",
    "sits outside the season baseline",
    "sits outside the.",
])
  4. Block cards that fail truncation check:
if is_truncated(card['context']):
    logger.error(f"[QUALITY] Blocked {card['id']}: truncated text detected")
    return False
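Since the root cause list says the banned phrase list is not removing the old filler, it may help to check how the matching itself is done. The sketch below is an assumption about one tolerant way to match (case-insensitive, whitespace-collapsed). find_banned_phrases is a hypothetical helper, and the phrase list is copied from the additions proposed in this issue rather than from the full production list.

```python
# Phrases copied from this issue; the real BANNED_PHRASES list is longer.
BANNED_PHRASES = [
    "Key signal: This stat has moved consistently",
    "sits outside the season baseline",
    "sits outside the.",
]


def find_banned_phrases(text: str) -> list[str]:
    """Return every banned phrase in text, ignoring case and repeated whitespace."""
    normalized = " ".join(text.split()).lower()
    return [phrase for phrase in BANNED_PHRASES if phrase.lower() in normalized]
```

Returning the matched phrases, rather than a bare boolean, makes the resulting log line say which filler text triggered the block.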

Expected Outcome

  • No card has sentences ending mid-word
  • No card contains "Key signal: This stat has moved..."
  • All sentences are complete and grammatically correct

Issue 4: Factual Error - Wrong Month (P0)

Problem

Trap game card (dated March 23, 2026):

"Home dogs in November after a loss? They come out swinging."

It is March, not November. This is factually wrong.

Root Cause

LLM hallucinated the month, or template text from a different context was injected.

Required Fix

Add date validation to card text:

import re
from datetime import datetime

MONTHS = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']

def validate_date_references(card: dict) -> tuple[bool, str]:
    """Check that any month references match the card date."""
    card_date = datetime.strptime(card['cardDate'], '%Y-%m-%d')
    card_month = MONTHS[card_date.month - 1]  # e.g., "March"

    # Join fields with spaces so the end of one field cannot fuse with the start of the next
    text = ' '.join([card.get('context', ''), *card.get('why_it_matters', [])])

    for month in MONTHS:
        # Case-sensitive whole-word match, so the verb "may" is not read as the month May
        if month != card_month and re.search(rf'\b{month}\b', text):
            return False, f"Card references {month} but card date is {card_month}"

    return True, ""

Block cards with wrong month references.

Expected Outcome

  • No card references a month that doesn't match its card date
  • Cards referencing "November" in March are blocked before storage

Issue 5: Duplicate Players Across Dates (P1)

Problem

Player               Appearances
Dorian Finney-Smith  3 cards
Rayan Rupert         2 cards
Leaky Black          2 cards

Same player appears in multiple cards across different dates for the same signal type.

Required Fix

Add deduplication at storage time:

def should_store_card(card: dict) -> bool:
    player = card.get('player_name')
    card_type = card.get('cardType')
    
    if player:
        # Check if this player already has this card type in last 3 days
        existing = db.query("""
            SELECT COUNT(*) FROM intelligence_cards
            WHERE player_name = %s 
            AND card_type = %s
            AND card_date > CURRENT_DATE - INTERVAL '3 days'
            AND archived_at IS NULL
        """, [player, card_type])
        
        if existing[0][0] > 0:
            logger.info(f"[DEDUP] Skipping {card_type} for {player}: already has recent card")
            return False
    
    return True
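The same 3-day rule can be exercised without a database, which is useful for unit tests and for cleaning up cards that are already stored. The sketch below is an in-memory approximation under the assumption that cards carry player_name, cardType, and an ISO-formatted cardDate. dedupe_recent is a hypothetical helper, not part of the existing pipeline.

```python
from datetime import date, timedelta


def dedupe_recent(cards: list[dict], window_days: int = 3) -> list[dict]:
    """Keep the earliest card per (player, card type) within a rolling window."""
    last_kept: dict[tuple[str, str], date] = {}
    kept = []
    # ISO dates sort correctly as strings, so the oldest card is seen first.
    for card in sorted(cards, key=lambda c: c["cardDate"]):
        player = card.get("player_name")
        if not player:
            kept.append(card)  # team-level cards are not deduplicated
            continue
        key = (player, card["cardType"])
        day = date.fromisoformat(card["cardDate"])
        previous = last_kept.get(key)
        if previous is not None and day - previous < timedelta(days=window_days):
            continue  # same player and card type inside the window
        last_kept[key] = day
        kept.append(card)
    return kept
```

The window is measured from the last card that was kept, which mirrors the SQL check: a card is skipped only while a stored card for the same player and type is less than three days old.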

Expected Outcome

  • Each player has at most 1 card per card type per 3-day window
  • No duplicate "RUNNING ON FUMES" cards for the same player

Issue 6: Workload Over-Represented (P1)

Problem

14 of 20 cards (70%) are workload type. Feed lacks variety.

Required Fix

Implement card diversity caps (Issue #160):

from collections import defaultdict

CARD_TYPE_CAPS = {
    'workload': 5,
    'hot_streak': 8,
    'cold_streak': 8,
    'trap_game': 5,
    'team_post_loss': 5,
    'rivalry': 3,
}

def filter_by_type_cap(cards: list[dict], date: str) -> list[dict]:
    """Enforce per-type caps, keeping highest verification scores."""
    by_type = defaultdict(list)
    for card in cards:
        by_type[card['cardType']].append(card)
    
    result = []
    for card_type, type_cards in by_type.items():
        cap = CARD_TYPE_CAPS.get(card_type, 10)
        # Sort by verification score descending
        sorted_cards = sorted(type_cards, key=lambda c: c['verificationScore'], reverse=True)
        result.extend(sorted_cards[:cap])
    
    return result
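One detail worth noting about the capping approach: the result comes back grouped by card type, so the feed order depends on which type was seen first. If the feed should be ranked across types, a final sort can follow the cap. The sketch below is a self-contained variant under that assumption. cap_and_rank is a hypothetical name, and the fallback cap of 10 matches the default used above.

```python
from collections import defaultdict

# Caps copied from the proposal in this issue.
CARD_TYPE_CAPS = {
    "workload": 5, "hot_streak": 8, "cold_streak": 8,
    "trap_game": 5, "team_post_loss": 5, "rivalry": 3,
}
DEFAULT_CAP = 10


def cap_and_rank(cards: list[dict]) -> list[dict]:
    """Apply per-type caps, then rank the survivors by score across all types."""
    by_type = defaultdict(list)
    for card in cards:
        by_type[card["cardType"]].append(card)

    kept = []
    for card_type, group in by_type.items():
        # Highest verification scores survive the cap.
        group.sort(key=lambda c: c["verificationScore"], reverse=True)
        kept.extend(group[: CARD_TYPE_CAPS.get(card_type, DEFAULT_CAP)])

    # Final cross-type ranking so the feed is not grouped by type.
    return sorted(kept, key=lambda c: c["verificationScore"], reverse=True)
```

Ranking by verification score only becomes meaningful once Issue 1 is fixed, since every card currently scores 100.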

Expected Outcome

  • Max 5 workload cards per day
  • Feed shows variety: trap_game, rivalry, team_post_loss, workload, streaks

Issue 7: Still Bench Player Heavy (P1)

Problem

Current workload cards feature: Leaky Black, Will Richard, Leonard Miller, Jett Howard, Kel'el Ware, Rayan Rupert.

These are bench players averaging 15-25 minutes. No cards for stars.

Diagnosis Needed

Check the diagnostic logs added in the last deploy:

grep -F "[FILTER]" /var/log/cypher.log | grep -E "LeBron|Curry|Jokic|Tatum|Edwards" | tail -20

Are stars being filtered? Or are they not triggering detector thresholds?

If stars aren't appearing in candidates at all, the detector thresholds may need adjustment.

If stars appear but get filtered, the tier/sigma logic has a bug (tier 1 should always pass).

Expected Outcome

  • At least 30% of player cards feature tier 1 (star) players
  • Diagnostic logs show where star players are in the pipeline
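The 30% target can be tracked with a small metric computed over each day's stored cards. The sketch below assumes each player card carries a numeric `tier` field where 1 means star. That field name is a guess for illustration, as is the tier1_share helper, so adjust both to whatever the pipeline actually stores.

```python
def tier1_share(cards: list[dict]) -> float:
    """Return the fraction of player cards that feature a tier 1 player."""
    # Team-level cards (no player_name) are excluded from the denominator.
    player_cards = [c for c in cards if c.get("player_name")]
    if not player_cards:
        return 0.0
    stars = sum(1 for c in player_cards if c.get("tier") == 1)
    return stars / len(player_cards)
```

Logging this value per deploy would show whether threshold or filter changes are moving the feed toward the target.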

Execution Order

  1. Issue 1 (P0): Fix stat verification - claims must actually be extracted and checked
  2. Issue 3 (P0): Fix text truncation - no more incomplete sentences
  3. Issue 4 (P0): Add date validation - block wrong month references
  4. Issue 2 (P1): Fix verify-card endpoint
  5. Issue 5 (P1): Add player deduplication
  6. Issue 6 (P1): Implement card type caps
  7. Issue 7 (P1): Diagnose why stars aren't generating cards

Definition of Done

  • claims_total > 0 on cards with verifiable stats
  • claim_details array populated with verification queries
  • verify-card endpoint accepts string IDs
  • No card has truncated sentences
  • No card has "Key signal: This stat has moved..." text
  • No card references wrong month
  • No duplicate player+cardType within 3 days
  • Max 5 workload cards per day
  • At least 1 star player card per day (when stars have signals)
