P0: Fix Truncated Card Text and Repetitive Filler #161

@kevsilk597

Problem

Card text is garbage. Examples from today's cards:

Truncated sentences:

"Heat are home. Bam's best basketball happens when Miami's in the American Airlines"

"Shai and Chet need everyone fresh for the."

"Key signal: This stat has moved consistently over the last three games and sits outside the."

Repetitive filler appearing 2-3x per card:

"Three-game rolling averages confirm the direction across pace, role, and opponent caliber."

Nonsensical template fragments:

"Market Pressure: line behavior near tip **1**"
"Form Window: recent game sample **1**"

This is not professional content. This cannot be shown to paying customers.

Root Cause

  1. Card generation prompts or templates are producing truncated output
  2. Filler text is being injected without deduplication
  3. Raw template variables are leaking into final text

Requirements

1. Text Completeness Validation

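import re
from collections import Counter

# Words that rarely end a complete sentence; a trailing one suggests a cut-off
DANGLING_ENDINGS = {"the", "a", "an", "and", "or"}

def validate_text_completeness(text: str) -> tuple[bool, list[str]]:
    """Check for truncated text, repeated phrases, and template leakage."""
    issues = []

    # Split after terminal punctuation so each sentence keeps its own ending
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences:
        words = sentence.rstrip(".!?\"'").split()
        if not words:
            continue
        if sentence[-1] not in ".!?\"'":
            if len(words) > 3:  # Not just a short phrase
                issues.append(f"Truncated sentence: ...{sentence[-50:]}")
        elif words[-1].lower() in DANGLING_ENDINGS:
            # Punctuated but cut off, e.g. "...need everyone fresh for the."
            issues.append(f"Truncated sentence: ...{sentence[-50:]}")

    # Check for repeated phrases
    # (extract_phrases is a helper assumed to be defined elsewhere)
    phrases = extract_phrases(text, min_words=5)
    phrase_counts = Counter(phrases)
    for phrase, count in phrase_counts.items():
        if count > 1:
            issues.append(f"Repeated phrase ({count}x): {phrase[:50]}...")

    # Check for template leakage
    template_patterns = [
        r"\*\*\d+\*\*$",  # Ends with **1** or similar
        r"line behavior near tip \*\*\d+\*\*",
        r"recent game sample \*\*\d+\*\*",
    ]
    for pattern in template_patterns:
        if re.search(pattern, text):
            issues.append(f"Template leakage: {pattern}")

    return len(issues) == 0, issues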
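
The validator calls `extract_phrases`, which this issue does not define. One possible reading, sketched here as an assumption rather than the intended implementation, treats each sentence of at least `min_words` words as a phrase and normalizes case and punctuation so that repeated filler sentences produce identical strings for `Counter`.

```python
import re


def extract_phrases(text: str, min_words: int = 5) -> list[str]:
    """Return normalized sentences that contain at least `min_words` words.

    Hypothetical helper: a "phrase" here is a whole sentence, lowercased
    and stripped of punctuation, so exact repeats compare equal.
    """
    phrases = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Keep letters, digits, apostrophes, and hyphens; drop other punctuation
        words = re.findall(r"[a-z0-9'-]+", sentence.lower())
        if len(words) >= min_words:
            phrases.append(" ".join(words))
    return phrases
```

Sentence-level matching keeps the issue list short (one entry per repeated sentence). Sliding word windows would also catch partial repeats, at the cost of reporting many overlapping matches for a single duplicated sentence.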

2. Block Cards with Text Issues

# In quality gate
is_complete, text_issues = validate_text_completeness(card["context"])
if not is_complete:
    log(f"[QUALITY_GATE] BLOCKED {card['id']}: text issues: {text_issues}")
    return False
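
The snippet above checks only `card["context"]`, while the deduplication step in requirement 4 touches `context`, `why_it_matters`, and `insight`. A minimal sketch of a gate that covers all three fields follows. The function name `passes_text_gate` is hypothetical, and the validator is passed in as a parameter so the sketch does not depend on any particular implementation.

```python
from typing import Callable

# Field names taken from the deduplication step in this issue
TEXT_FIELDS = ("context", "why_it_matters", "insight")

Validator = Callable[[str], tuple[bool, list[str]]]


def passes_text_gate(card: dict, validate: Validator) -> tuple[bool, dict[str, list[str]]]:
    """Run `validate` over every text field; return (ok, issues grouped by field)."""
    problems: dict[str, list[str]] = {}
    for field in TEXT_FIELDS:
        value = card.get(field)
        if isinstance(value, str):
            texts = [value]
        elif isinstance(value, list):
            # Assumes list fields hold plain strings; other items are skipped
            texts = [item for item in value if isinstance(item, str)]
        else:
            texts = []
        for text in texts:
            ok, issues = validate(text)
            if not ok:
                problems.setdefault(field, []).extend(issues)
    return not problems, problems
```

Grouping issues by field makes the `[QUALITY_GATE]` log line easier to act on, since it shows which part of the card failed.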

3. Fix the Source of Truncation

Find where card text is generated and fix:

  • If using an LLM: check whether responses are being cut off at the token limit, and increase max_tokens if so
  • If using templates: ensure all variables are populated
  • If concatenating: ensure proper sentence boundaries
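
For the template case, one way to stop unpopulated variables from reaching a card is to fail at render time. The sketch below assumes templates use Python `str.format`-style named placeholders, which this issue does not confirm. The name `render_strict` and the example template are illustrative only.

```python
from string import Formatter


def render_strict(template: str, values: dict) -> str:
    """Fill a str.format-style template, raising if any variable is missing or empty."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion)
    fields = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = sorted(f for f in fields if values.get(f) in (None, ""))
    if missing:
        raise ValueError(f"Unpopulated template variables: {missing}")
    return template.format(**values)


# Usage: a fully populated template renders; a missing value raises
print(render_strict("Form Window: last {games} games", {"games": 3}))
```

Raising at the source gives a clearer error than catching a leaked fragment later in the quality gate, although the gate is still worth keeping as a backstop.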

4. Deduplicate Content Before Output

def deduplicate_card_content(card: dict) -> dict:
    """Remove repeated sentences and list items from card text fields (in place)."""
    for field in ["context", "why_it_matters", "insight"]:
        value = card.get(field)
        if isinstance(value, str):
            # remove_duplicate_sentences is a helper assumed to be defined elsewhere
            card[field] = remove_duplicate_sentences(value)
        elif isinstance(value, list):
            card[field] = list(dict.fromkeys(value))  # Preserve order, remove dupes
    return card
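
`remove_duplicate_sentences` is referenced above but not defined in this issue. A minimal sketch, assuming that duplicates should be matched case-insensitively and ignoring punctuation, and that the first occurrence is the one to keep:

```python
import re


def remove_duplicate_sentences(text: str) -> str:
    """Keep the first occurrence of each sentence and drop later repeats."""
    seen = set()
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Comparison key: lowercase, with punctuation collapsed to spaces
        key = re.sub(r"\W+", " ", sentence.lower()).strip()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence)
    return " ".join(kept)
```

This removes the symptom only. If the same filler sentence is injected two or three times per card, the injection logic itself still needs fixing.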

Validation

After deploy:

  • Generate cards
  • Verify no card has a sentence ending without punctuation
  • Verify no card has the same phrase repeated twice
  • Verify no card contains "**1**" or similar template fragments
  • Read 10 random cards — all should be coherent, complete sentences

Definition of Done

  • Text completeness validation implemented
  • Cards with truncated text are blocked
  • Cards with repeated phrases are blocked
  • Template leakage is detected and blocked
  • Source of truncation identified and fixed
  • All card text reads as professional, complete sentences
