Single Source of Truth: Only data/latest/ and data/batches/ (via influx-harvest pipeline) contain production data. All analysis files must be in .cccc/work/scratch/ or archive/manual_analysis/.
- Archived Manual Files: Moved 10 stale .jsonl files from root to archive/manual_analysis/
- Updated POR Targets: Lowered from 250 to 150-200 authors to reflect the 84-author baseline
- Cleaned Root Directory: Removed temporary analysis artifacts
- data/latest/latest.jsonl - Current dataset (84 authors)
- data/latest/manifest.json - Dataset metadata
- data/batches/ - influx-harvest processed batches only
- data/release/ - Production releases
- .cccc/work/scratch/ - Active analysis files
- archive/manual_analysis/ - Completed analysis
- Root directory: NO .jsonl files allowed
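
The rule that the root directory holds no .jsonl files could be checked automatically. The following is a minimal sketch, not part of the influx tooling: the function name `find_stray_jsonl` is hypothetical, and it assumes the check is run against the repository root and should only look at files directly in that directory, not in subdirectories such as data/ or archive/.

```python
from pathlib import Path


def find_stray_jsonl(repo_root):
    """Return names of .jsonl files sitting directly in repo_root.

    Subdirectories (data/, archive/, .cccc/) are intentionally not
    searched, because .jsonl files are allowed there.
    """
    root = Path(repo_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_file() and entry.suffix == ".jsonl"
    )
```

A pre-commit hook or CI step could call `find_stray_jsonl(".")` and fail when the returned list is non-empty, listing the offending files so they can be moved to .cccc/work/scratch/ or archive/manual_analysis/.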
- All production data must pass influx-validate --strict
- Provenance hashes required for all records
- Single-path pipeline through influx-harvest mandatory
- No Manual .jsonl in Root: Prevents confusion with production data
- Archive Policy: Analysis files moved after completion
- Pipeline Guard: Only influx-harvest outputs in production
- Strict Validation: 100% compliance required
- Provenance Tracking: SHA256 hashes for audit trail
- Single Source: Eliminates duplicate/conflicting data
- Process M11 Batch: 22 qualified authors through influx-harvest
- Update Documentation: Add policy to PROJECT.md
- Reach 106 Authors: First milestone in rebuild (84 baseline + 22 from the M11 batch)
- Process M13 Security: 10 qualified authors
- Implement Workspace Policy: Formalize analysis file handling
- Reach 150 Authors: Realistic target from clean baseline
- ✅ Production Data: 84 authors, 100% validated
- ✅ Workspace Clean: All manual files archived
- ✅ Policy Updated: POR targets aligned with reality
- ✅ Pipeline Ready: influx-harvest functional
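
The provenance requirement above (a SHA256 hash on every record, for the audit trail) can be illustrated with a short sketch. This is an assumption about how such a hash might be computed, not a description of what influx-harvest actually does: the field name `provenance_hash`, the canonical-JSON serialization, and both function names are hypothetical.

```python
import hashlib
import json

# Hypothetical field name; the real pipeline may store the hash elsewhere.
HASH_FIELD = "provenance_hash"


def compute_provenance_hash(record):
    """SHA256 over a canonical JSON form of the record, minus the hash field."""
    payload = {k: v for k, v in record.items() if k != HASH_FIELD}
    # Sorted keys and fixed separators make the hash independent of key order.
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_valid_provenance(record):
    """True if the stored hash matches the recomputed one."""
    return record.get(HASH_FIELD) == compute_provenance_hash(record)
```

Serializing with sorted keys means two records with the same content always hash identically, so a modified or hand-edited record is detectable by recomputing the hash and comparing it with the stored value.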
Policy Implementation Complete: 2025-11-23T02:15:00Z
Status: Ready for accelerated domain batch processing