Skill Being Reviewed
Skill name: model-supply-chain
Skill path: skills/ai-security/model-supply-chain/
False Positive Analysis
Benign-looking release flow that can be over-credited:
model:
  source: huggingface://org/model
  revision: main
evaluation:
  report: eval-results.json
approval:
  ticket: AI-1234
deploy:
  image: registry.example.com/inference:latest
  model_uri: s3://models/org/model/latest
Why this is a false positive:
The flow has a model source, evaluation report, approval ticket, and deployment target, but it does not prove that the evaluated artifact is the same artifact deployed to production. Mutable refs such as main, latest, and unpinned S3 paths can change between evaluation and release. A review can credit provenance, model card, and evaluation controls while missing model substitution during promotion.
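As a rough illustration of the mutable-reference problem, the sketch below flags the three unpinned references in the release flow above. The function name `find_mutable_refs`, the dictionary layout, and the pinning rules (40-character commit SHA, image referenced by `@sha256:` digest, S3 URI carrying a `versionId`) are assumptions made for this example, not something the skill currently defines.

```python
import re

# Assumption: a 40-character hex string is treated as a pinned Git commit SHA.
COMMIT_SHA = re.compile(r"^[0-9a-f]{40}$")
# Assumption: a container image counts as pinned only when referenced by digest.
IMAGE_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def find_mutable_refs(manifest: dict) -> list[str]:
    """Return findings for references that can change after evaluation."""
    findings = []
    revision = manifest.get("model", {}).get("revision", "")
    if not COMMIT_SHA.match(revision):
        findings.append(f"model.revision is not a commit SHA: {revision!r}")
    deploy = manifest.get("deploy", {})
    image = deploy.get("image", "")
    if not IMAGE_DIGEST.search(image):
        findings.append(f"deploy.image is not pinned by digest: {image!r}")
    model_uri = deploy.get("model_uri", "")
    # Hypothetical convention: a pinned S3 reference carries an explicit versionId.
    if "versionId=" not in model_uri:
        findings.append(f"deploy.model_uri has no object version: {model_uri!r}")
    return findings


# The release flow from the example above, expressed as a dictionary.
release = {
    "model": {"source": "huggingface://org/model", "revision": "main"},
    "deploy": {
        "image": "registry.example.com/inference:latest",
        "model_uri": "s3://models/org/model/latest",
    },
}
for finding in find_mutable_refs(release):
    print(finding)
```

A reviewer could treat any non-empty result as grounds to withhold credit for provenance until the references are pinned.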
Coverage Gaps
Missed variant 1: evaluated revision differs from deployed revision
The evaluation report references only a model repo name, while deployment pulls whatever the default branch or latest object path resolves to at release time.
Missed variant 2: approval is not bound to artifact identity
The approval ticket names the model family but not the exact digest, commit SHA, artifact URI version, model card version, evaluation run ID, or signing attestation.
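One way a reviewer might check this binding is sketched below. The field names and the policy of requiring all of them are assumptions for illustration; a team could reasonably accept a subset, for example a digest plus an attestation.

```python
# Hypothetical field names for the identity evidence an approval should carry.
REQUIRED_IDENTITY_FIELDS = (
    "artifact_digest",
    "commit_sha",
    "artifact_uri_version",
    "model_card_version",
    "evaluation_run_id",
    "attestation",
)


def missing_identity_fields(approval: dict) -> list[str]:
    """List the identity fields that are absent or empty in an approval record."""
    return [name for name in REQUIRED_IDENTITY_FIELDS if not approval.get(name)]


# An approval that names only the ticket and model family, as in this variant.
ticket = {"ticket": "AI-1234", "model_family": "org/model"}
print(missing_identity_fields(ticket))
```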
Missed variant 3: rollback uses unverified artifacts
Rollback points to a previous alias or bucket prefix without verifying its checksum, signature, evaluation status, and known vulnerability/backdoor test status.
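A minimal sketch of a rollback check follows, assuming the rollback target is a local file and that a release record stores its SHA-256 digest and gate results. The record keys (`sha256`, `evaluation_passed`, `backdoor_scan_passed`) are hypothetical, and signature verification is deliberately left out because it depends on the signing tool in use.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rollback_is_verified(path: Path, record: dict) -> bool:
    """Accept a rollback artifact only if its digest and recorded gates check out."""
    checksum_ok = hmac.compare_digest(sha256_file(path), record.get("sha256", ""))
    return (
        checksum_ok
        and record.get("evaluation_passed") is True
        and record.get("backdoor_scan_passed") is True
    )
```

The point of the design is that a rollback goes through the same identity check as a forward release, rather than trusting an alias or bucket prefix.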
Edge Cases
- API-only hosted models may not expose weight digests, but still need provider version IDs, deployment IDs, or immutable snapshot references.
- LoRA/adapter releases must bind base model digest and adapter digest together.
- Canary and shadow deployments should record the exact artifact identity and evaluation gates used for each environment.
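For the LoRA/adapter case, one possible way to bind the two digests is to derive a single release identifier from both, so that swapping either the base model or the adapter changes the identifier. The function name and the encoding of the payload are assumptions for this sketch.

```python
import hashlib


def combined_release_id(base_digest: str, adapter_digest: str) -> str:
    """Derive one identifier that changes if either the base or adapter changes."""
    # Labelled fields keep the two digests from being swapped unnoticed.
    payload = f"base={base_digest}\nadapter={adapter_digest}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


release_id = combined_release_id("sha256:" + "a" * 64, "sha256:" + "b" * 64)
print(release_id)
```

Evaluation reports and approvals could then reference this one identifier instead of naming the adapter alone.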
Remediation Quality
Comparison to Other Tools
| Tool | Catches this? | Notes |
| --- | --- | --- |
| Model registry | Partial | Can track versions, but reviewers must verify promotion policies and deployment references. |
| MLflow / experiment tracker | Partial | Records runs and metrics, but may not enforce production artifact digest binding. |
| SLSA / attestations | Partial | Provides provenance when generated and verified; release policy must consume it. |
Overall Assessment
Strengths: Strong coverage of model provenance, training pipeline integrity, model card completeness, and backdoor detection.
Needs improvement: Add release/promotion evidence so reviewers can prove that the model approved after evaluation is the same immutable artifact that is deployed to production and used for rollback.
Priority recommendations:
- Add a model promotion gate checklist under provenance or fine-tuning pipeline review.
- Require artifact digest/revision, evaluation run ID, model card version, approval ID, and deploy manifest to match.
- Add output fields for promotion status, mutable references, rollback verification, and environment-specific artifact identity.
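The matching requirement in these recommendations could be sketched as a promotion gate like the one below. The record shapes, field names, and output keys are hypothetical and only illustrate the idea of comparing the evaluation record, the approval, and the deploy manifest before promotion.

```python
# Hypothetical fields that must agree across evaluation, approval, and deploy.
MATCH_FIELDS = ("artifact_digest", "evaluation_run_id", "model_card_version")


def promotion_gate(evaluation: dict, approval: dict, deploy: dict) -> dict:
    """Compare identity fields across the three records and report mismatches."""
    mismatches = []
    for name in MATCH_FIELDS:
        values = {evaluation.get(name), approval.get(name), deploy.get(name)}
        # A missing value counts as a mismatch, not as agreement.
        if None in values or len(values) != 1:
            mismatches.append(name)
    approval_id = approval.get("approval_id")
    if approval_id is None or approval_id != deploy.get("approval_id"):
        mismatches.append("approval_id")
    return {
        "promotion_status": "fail" if mismatches else "pass",
        "mismatched_fields": mismatches,
        "environment": deploy.get("environment"),
        "artifact_digest": deploy.get("artifact_digest"),
    }
```

Returning the environment and digest alongside the status gives each environment its own recorded artifact identity, which is the output the third recommendation asks for.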
Sources Checked
Bounty Info