contest-ingest: add timestamp validation to reject typo'd dates #2

@KI7MT

Finding

The morning-health-check PCS plugin pilot surfaced 862 contest.bronze rows (0.00037% of 234M) with out-of-band timestamps:

  • Pre-2005 rows (some 1970-01-01 epoch fallbacks, others 1996/1998/2000-2003 — before our intended 2005-2025 contest range)
  • Future-dated rows (2027, 2028, 2047, 2054, 2065, 2080, 2088 — operator typos in Cabrillo logs)

Examples of root cause — source file year vs parsed timestamp year mismatch:

| Source | Parsed timestamp | Likely operator typo |
| --- | --- | --- |
| cq-ww/2010cw | 1971-03-26 | "71" instead of "10" |
| cq-wpx/2008ph | 2080-03-30 | "80" instead of "08" |
| cq-ww/2007cw | 1970-03-23 | epoch fallback |

This is upstream data corruption in the original Cabrillo logs themselves, not a pipeline bug — we faithfully ingested what operators submitted.
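
The source-year vs parsed-year comparison in the examples above could be automated roughly as follows. This is only a sketch: it assumes source keys always look like `<contest>/<YYYY><mode>` (as in `cq-ww/2010cw`), and the helper names `source_year` and `year_mismatch` are hypothetical, not existing contest-ingest functions.

```python
import re
from datetime import datetime

# Assumed source key layout: "<contest>/<YYYY><mode>", e.g. "cq-ww/2010cw".
_SOURCE_YEAR = re.compile(r"/(\d{4})[a-z]+$")


def source_year(source: str) -> int:
    """Extract the four-digit year from a source key (hypothetical helper)."""
    match = _SOURCE_YEAR.search(source)
    if match is None:
        raise ValueError(f"no year found in source key: {source!r}")
    return int(match.group(1))


def year_mismatch(source: str, parsed: datetime) -> bool:
    """True when the parsed QSO year differs from the source file year."""
    return parsed.year != source_year(source)


# The three examples from the finding above.
examples = [
    ("cq-ww/2010cw", datetime(1971, 3, 26)),
    ("cq-wpx/2008ph", datetime(2080, 3, 30)),
    ("cq-ww/2007cw", datetime(1970, 3, 23)),
]
flagged = [src for src, ts in examples if year_mismatch(src, ts)]
print(flagged)
```

A bare year comparison is coarser than a per-contest date window and would misfire for any event that crosses a year boundary, so it is probably better suited to a health-check query than to an ingest-time gate.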

Proposed fix

Add ingest-time validation in contest-ingest (or the upstream parser):

  1. Each contest event has a known expected date range (typically a 2-day weekend window). Reject any QSO timestamp outside [contest_start - 7d, contest_end + 7d].
  2. Log rejected rows with full row content + reason so operators can review.
  3. Track rejection count in the ingest summary output.
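
The three steps above might be sketched like this. It is a minimal illustration under assumptions, not the contest-ingest implementation: the row shape (a dict with a `timestamp` key), the `IngestSummary` type, the `validate_rows` function, and the contest dates in the usage lines are all hypothetical.

```python
import logging
from dataclasses import dataclass, field
from datetime import datetime, timedelta

logger = logging.getLogger("contest_ingest.validation")  # hypothetical logger name

# Tolerance from the proposal: 7 days either side of the contest window.
TOLERANCE = timedelta(days=7)


@dataclass
class IngestSummary:
    accepted: int = 0
    rejected: int = 0
    rejections: list = field(default_factory=list)  # (row, reason) pairs


def validate_rows(rows, contest_start, contest_end, tolerance=TOLERANCE):
    """Keep rows whose timestamp is inside the padded contest window."""
    lo = contest_start - tolerance
    hi = contest_end + tolerance
    summary = IngestSummary()
    accepted = []
    for row in rows:
        ts = row["timestamp"]
        if lo <= ts <= hi:
            accepted.append(row)
            summary.accepted += 1
        else:
            reason = (
                f"timestamp {ts.isoformat()} outside "
                f"[{lo.isoformat()}, {hi.isoformat()}]"
            )
            # Step 2: log the full row content plus the reason.
            logger.warning("rejected row=%r reason=%s", row, reason)
            summary.rejections.append((row, reason))
            # Step 3: count rejections for the ingest summary.
            summary.rejected += 1
    return accepted, summary


# Illustrative rows and window only; not real log data.
rows = [
    {"call": "K1ABC", "timestamp": datetime(2008, 3, 29, 14, 5)},
    {"call": "K1ABC", "timestamp": datetime(2080, 3, 30, 9, 12)},
]
accepted, summary = validate_rows(
    rows, datetime(2008, 3, 29), datetime(2008, 3, 31)
)
print(f"accepted={summary.accepted} rejected={summary.rejected}")
# prints "accepted=1 rejected=1"
```

Returning the rejections alongside the accepted rows keeps the validator free of side effects beyond logging, so the caller can decide whether to write rejects to a review file or only report the count.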

Scope

  • Affects: contest-download/contest-ingest (Cabrillo parsing path)
  • Bronze rows: 862 already-ingested rows could either stay (validation is forward-only) or be cleaned via a one-shot ALTER TABLE DELETE — recommend keeping for now (bronze is the raw archive layer).
  • Downstream impact: contest.signatures already filters to sane date ranges, so this is hygiene/observability, not data quality.

Discovered by

morning-health-check pilot, first run, 2026-05-17. See KI7MT/fleet-ops/plugins/morning-health-check/README.md finding F-1.

Priority

Low — 0.00037% of rows. Pipeline observability gain rather than data quality blocker.

Labels

infra, bob, deferred

🤖 Generated with Claude Code
