aRealGem/stockflow

Stockflow - System Overview and Test Coverage

What the system does
- Pipeline CLI
  - init-qc: Runs pre-flight QC on originals. Moves files into out/init_qc/passed or failed and writes init_report.json. Annotates whether FPS normalization may be needed.
  - hideband-transcode: Applies band-hide (crop/vflip/gblur/overlay), optionally normalizes FPS to the nearest allowed site rate, and transcodes to MP4 (H.264) or MOV (ProRes) based on pixel format. Validates delivery QC, splits outputs into out/hideband_transcode/passed or failed, writes delivery_report.json, and appends entries to out/_stockflow_manifest.json.
  - embed-meta: Embeds Title/Description/Keywords into media via exiftool using cache or LLM-backed generation for cache misses. Persists new metadata to csv/master.csv and logs LLM usage (csv/_llm_usage.jsonl).
  - qc: Probes each file with ffprobe and validates against per-site rules in qc/specs.yaml. Writes qc_report.json and prints a console summary.
  - process: One-off transcode for files from an input dir with optional gating against qc_report.json. Appends outputs to the manifest.
  - make-csv: Builds provider CSVs (master, Shutterstock, Pond5, Adobe) from media using cached or LLM-generated metadata.
  - upload: Uploads .mp4/.mov under out/hideband_transcode/passed to pond5 (FTP), shutterstock (FTPS), or adobe (SFTP). Optionally enforces that outputs exist and are qc_passed in the manifest. Writes per-site upload reports under out/ready-for-upload/auto-uploaded/.
  - portals, debug-ftplist: portals opens the contributor portals in a browser; debug-ftplist performs a credential-checked remote LIST for debugging.
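
The FPS normalization in hideband-transcode snaps a clip's frame rate to the nearest allowed site rate. A minimal sketch of that selection follows. The rate list, the tolerance, and the function name are assumptions for illustration only; the real values come from qc/specs.yaml and the encode config.

```python
from typing import Sequence, Tuple

# Hypothetical allowed rates; the actual list is defined per site in qc/specs.yaml.
EXAMPLE_ALLOWED_FPS = (23.976, 24.0, 25.0, 29.97, 30.0, 50.0, 59.94, 60.0)


def nearest_allowed_fps(
    fps: float,
    allowed: Sequence[float] = EXAMPLE_ALLOWED_FPS,
    tolerance: float = 0.01,  # hypothetical tolerance, not taken from the project
) -> Tuple[float, bool]:
    """Return (target_fps, needs_normalization) for a probed frame rate."""
    if not allowed:
        raise ValueError("allowed must contain at least one rate")
    # Pick the allowed rate with the smallest absolute distance to the source rate.
    target = min(allowed, key=lambda rate: abs(rate - fps))
    # Only flag the clip when it is further from the target than the tolerance.
    return target, abs(target - fps) > tolerance
```

Under these assumptions, a 29.5 fps clip would be re-timed to 29.97, while a 25 fps clip would pass through unchanged.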

Core modules
- QC
  - probe: Runs ffprobe and returns a typed VideoProbe (container, size, width/height, fps, duration, codec, pix_fmt, audio).
  - validate: Checks duration window, size limits, container allow/reject, FPS tolerance, and basic audio rules per site (specs.yaml). Includes qc_all for directory scans.
  - preflight: Common-duration and configurable resolution-floor checks for init-qc.
  - gate: Read/write qc_report.json, build pass indices, and gate processing against reports.
- Transform
  - bandhide: Computes band height and applies crop/vflip/gblur/overlay filtergraph. CFR is enforced by output options (-r ... -vsync cfr).
  - transcode: Encodes MP4 (CRF or CBR with HRD signaling) or ProRes MOV; optional audio handling; integrates band-hide and FPS normalization.
- Decision
  - richness: Chooses mp4 vs mov based on pixel format bit depth and chroma (configurable thresholds).
- Metadata
  - parse_name/generate: Subject extraction and deterministic baseline texts.
  - llm: Strict, filename-bound prompting; injects fixed location and optional date; ranks keywords; cost accounting; API key from env or Keychain.
  - embed: Embeds metadata and updates csv/master.csv; logs LLM usage.
  - cache: Loads/updates master.csv as canonical metadata cache.
  - csv writers: Provider-specific builders with keyword cleaning/ranking and caps; generic CSV writer.
- Upload
  - common: Credential resolution order (CLI overrides > Keychain > env), masking, trimming.
  - per-site: Shutterstock FTPS, Pond5 FTP, Adobe SFTP with progress, per-file result collection, and clean shutdown.
- Manifest
  - Append with de-dup by output; build index of qc_passed by absolute path for upload gating.
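
The manifest behavior described above (append with de-dup by output, then an index of qc_passed entries keyed by absolute path) can be sketched as follows. This is not the project's code: the JSON-list layout, the `output` field name, and the "later entry wins" rule are assumptions; only `qc_passed` is named in this overview.

```python
import json
import os
from pathlib import Path
from typing import Dict, Iterable, List, Set


def append_entries(manifest_path: str, entries: Iterable[dict]) -> List[dict]:
    """Append entries to a JSON manifest, keeping one entry per output path."""
    path = Path(manifest_path)
    existing = json.loads(path.read_text()) if path.exists() else []
    # Key every entry by its absolute output path so duplicates collapse.
    by_output: Dict[str, dict] = {os.path.abspath(e["output"]): e for e in existing}
    for entry in entries:
        # Assumption: a newer entry for the same output replaces the older one.
        by_output[os.path.abspath(entry["output"])] = entry
    merged = list(by_output.values())
    path.write_text(json.dumps(merged, indent=2))
    return merged


def passed_index(entries: Iterable[dict]) -> Set[str]:
    """Absolute output paths whose entry is marked qc_passed."""
    return {os.path.abspath(e["output"]) for e in entries if e.get("qc_passed")}
```

An upload gate could then check `os.path.abspath(candidate) in passed_index(entries)` before sending a file.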

Configuration and specs
- Config is loaded from YAML (see stockflow/config/config_master.yaml example). Key sections: encode (ffmpeg/ffprobe paths, band-hide and encoding knobs, FPS normalization), decision (container choice thresholds), preflight (resolution floor), portals, paths (in/out/csv roots), csv (keyword caps, categories), sites (host/port/credentials), llm (provider/model/pricing).
- qc/specs.yaml encodes documented site constraints for duration, size, containers, allowed FPS, and basic audio guidance.
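
To make the per-site checks concrete, here is a small sketch of validating a probe against one site's rules. The spec layout, the limits, and the `ProbeSketch` type are placeholders invented for illustration; they are not the contents of qc/specs.yaml, not real site limits, and not the project's VideoProbe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeSketch:
    """Simplified stand-in for a probe result (hypothetical fields)."""
    container: str
    size_bytes: int
    fps: float
    duration_s: float


# Placeholder numbers only; real constraints live in qc/specs.yaml.
EXAMPLE_SPEC = {
    "duration_s": {"min": 5.0, "max": 60.0},
    "max_size_bytes": 4 * 1024**3,
    "containers": ["mp4", "mov"],
    "fps": {"allowed": [23.976, 24.0, 25.0, 29.97, 30.0], "tolerance": 0.01},
}


def validate(probe: ProbeSketch, spec: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means pass."""
    problems: List[str] = []
    dur = spec["duration_s"]
    if not (dur["min"] <= probe.duration_s <= dur["max"]):
        problems.append(f"duration {probe.duration_s}s outside {dur['min']}-{dur['max']}s")
    if probe.size_bytes > spec["max_size_bytes"]:
        problems.append(f"size {probe.size_bytes} bytes exceeds limit")
    if probe.container not in spec["containers"]:
        problems.append(f"container {probe.container!r} not allowed")
    fps = spec["fps"]
    # A rate passes when it is within tolerance of any allowed rate.
    if not any(abs(probe.fps - rate) <= fps["tolerance"] for rate in fps["allowed"]):
        problems.append(f"fps {probe.fps} not within tolerance of an allowed rate")
    return problems
```

Returning a list of problems rather than a single boolean keeps enough detail for a report file and a console summary.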

Unit test coverage (summary)
- QC, decision, transcode
  - Probing and site validation pass for generated MP4 clips across Adobe/Shutterstock/Pond5.
  - MP4 CBR targets are hit within tolerance by resolution bucket.
  - Container decision selects MP4 for 8-bit 4:2:0 and MOV/ProRes for 10-bit 4:2:2 when supported.
  - Band-hide modifies only the bottom band; ProRes encode properties validated (if encoder available).
- Metadata and CSV
  - Filename parsing and subject extraction.
  - Baseline text generation produces expected fields and keywords.
  - Pond5 and Shutterstock CSV writers emit correct headers and row formats.
- Upload layer
  - Shutterstock FTPS and Adobe SFTP upload functions collect per-file success/failure via dummy transports.
  - CLI upload writes per-site JSON reports for empty sources and for stubbed uploader results.
  - Optional live auth smoke tests for Pond5/Shutterstock exist but are skipped by default.
- Utilities and scripts
  - scripts/reset.sh clears prior last-run/out/init_qc/passed contents.
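
The container decision covered by the tests above (MP4 for 8-bit 4:2:0, MOV/ProRes for 10-bit 4:2:2) could look roughly like the sketch below. The function names, the default thresholds, and the choice to require both conditions (rather than either) are assumptions; the overview only says the thresholds are configurable.

```python
import re
from typing import Sequence, Tuple

# Matches ffmpeg-style planar YUV names such as yuv420p, yuvj420p, yuv422p10le.
_PIX_FMT_RE = re.compile(r"yuvj?(420|422|444)p(\d+)?(?:le|be)?")


def parse_pix_fmt(pix_fmt: str) -> Tuple[int, str]:
    """Return (bit_depth, chroma) for a planar YUV pixel format name."""
    match = _PIX_FMT_RE.fullmatch(pix_fmt)
    if match is None:
        raise ValueError(f"unsupported pixel format: {pix_fmt}")
    chroma = match.group(1)
    # Formats without an explicit depth suffix (e.g. yuv420p) are 8-bit.
    depth = int(match.group(2)) if match.group(2) else 8
    return depth, chroma


def choose_container(
    pix_fmt: str,
    min_depth: int = 10,                          # hypothetical threshold
    rich_chroma: Sequence[str] = ("422", "444"),  # hypothetical threshold
) -> str:
    """Pick 'mov' (ProRes) for rich sources, otherwise 'mp4' (H.264)."""
    depth, chroma = parse_pix_fmt(pix_fmt)
    # Assumption: both depth and chroma must meet the thresholds to select MOV.
    return "mov" if depth >= min_depth and chroma in rich_chroma else "mp4"
```

Mixed cases such as 10-bit 4:2:0 fall back to MP4 in this sketch; the project may treat them differently.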

Out of default test scope / lightly covered
- End-to-end CLI flows (init-qc -> hideband-transcode -> embed-meta -> make-csv) are not executed as a single test.
- Manifest read/write and manifest-driven gating aren't unit-tested directly.
- qc_all orchestration is exercised via components but not directly asserted.
- portals (browser open) behavior isn't tested.
- embed-meta/ExifTool side effects aren't unit-tested.
- Live network integrations are guarded by STOCKFLOW_NETWORK_TESTS=1 and require configured Keychain credentials.
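
One common way to guard live tests behind an environment variable such as STOCKFLOW_NETWORK_TESTS is a skip decorator. The sketch below uses the standard-library unittest module; the project's actual test runner, test names, and guard helper are not stated here, so everything other than the variable name is hypothetical.

```python
import os
import unittest
from typing import Mapping


def network_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True only when the opt-in variable is set to exactly '1'."""
    return env.get("STOCKFLOW_NETWORK_TESTS") == "1"


@unittest.skipUnless(
    network_tests_enabled(),
    "set STOCKFLOW_NETWORK_TESTS=1 to run live network tests",
)
class LiveAuthSmokeTest(unittest.TestCase):
    """Placeholder for an opt-in smoke test; skipped unless explicitly enabled."""

    def test_opt_in_flag_is_set(self) -> None:
        # A real test would attempt a login against the remote site here.
        self.assertTrue(network_tests_enabled())
```

With the variable unset, the whole class is reported as skipped rather than failed, so a default test run needs no credentials.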

Test-time external dependencies
- ffmpeg and ffprobe are required; ProRes encoder is optional.
- Keyring is optional; when present, it enables real auth tests.
- OpenAI/LLM code paths are exercised by CLI features but not executed in unit tests.
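
A test session can check for these external tools up front and skip encoder-specific tests when something is absent. The helpers below are an illustrative sketch, not project code. `has_encoder` parses the text printed by `ffmpeg -encoders`; which ProRes encoder the project relies on is not stated, so `prores_ks` in the usage note is only an example.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    names: Iterable[str] = ("ffmpeg", "ffprobe"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the required executables that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def has_encoder(encoders_listing: str, name: str) -> bool:
    """Check an `ffmpeg -encoders` listing for an encoder by exact name."""
    for line in encoders_listing.splitlines():
        fields = line.split()
        # Encoder rows look like: "<flags> <name> <description...>".
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False
```

The listing could be captured with `subprocess.run(["ffmpeg", "-hide_banner", "-encoders"], capture_output=True, text=True).stdout` and then passed to `has_encoder(listing, "prores_ks")`.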

About

Automated pipeline for preparing and uploading stock footage to Shutterstock and Pond5, with partial support for Adobe. Includes QC, transform, metadata, uploading, and reporting.
