feat(agent): default to released HF model when no --ckpt is given #17

Open
evasnow1992 wants to merge 1 commit into NVIDIA-BioNeMo:main from evasnow1992:evax/skill-default-released-model

@evasnow1992

Collaborator

The finetune, embed, and continue-pretrain skills now offer to download the released pretrained hybrid model nvidia/NV-KERMT-70M-v2 from Hugging Face when the user provides no checkpoint. The download is consent-gated: it proceeds only after the user accepts an interactive prompt that names the NVIDIA Open Model License, or when an explicit --pretrained-release flag is passed for non-interactive runs (mutually exclusive with --ckpt). It runs in-container via the new fetch_released_model.py and lands as the standard released-model bundle, which feeds the existing --ckpt / auto --vocab-dir flow unchanged (no runner changes).
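
For reviewers, the consent gate described above could look roughly like the sketch below. It is illustrative only and is not taken from the diff: `build_parser`, `resolve_source`, the `interactive` argument, and the prompt wording are hypothetical names chosen for this example. Only the `--ckpt` and `--pretrained-release` flags, their mutual exclusivity, and the license named in the prompt come from the description.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: --ckpt and --pretrained-release cannot be combined."""
    parser = argparse.ArgumentParser(prog="skill")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--ckpt", help="path to an existing checkpoint")
    group.add_argument(
        "--pretrained-release",
        action="store_true",
        help="explicit consent to download the released model (non-interactive runs)",
    )
    return parser


def resolve_source(args: argparse.Namespace, interactive: bool, ask=input) -> str:
    """Return 'ckpt' or 'release', or raise if consent was never given."""
    if args.ckpt:
        return "ckpt"
    if args.pretrained_release:
        return "release"
    if interactive:
        # Assumed prompt wording; the real prompt only needs to name the license.
        answer = ask(
            "No --ckpt given. Download nvidia/NV-KERMT-70M-v2 under the "
            "NVIDIA Open Model License? [y/N] "
        )
        if answer.strip().lower() in ("y", "yes"):
            return "release"
    # Never download without consent: fail instead of guessing.
    raise ValueError("no checkpoint provided and no consent to download the release")
```

Injecting `ask` keeps the prompt testable without a terminal, and failing closed in the non-interactive case matches the "hard rule" wording in the SKILL.md bullet.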

  • agent/config/released_model.json: pin repo + revision + bundle filenames
  • agent/scripts/fetch_released_model.py: in-container snapshot download, idempotent reuse, JSON manifest, clear stale-image error
  • agent/scripts/kermt_container.sh: --model-dir RW mount + HF_TOKEN passthrough
  • environment.yml + Pipfile: add huggingface_hub (Pipfile.lock regen pending)
  • 3 SKILL.md: optional --ckpt + consent-gated resolve step + hard rule
  • agent/tests: mocked unit tests (always-run) + opt-in slow e2e
  • .gitignore: models/
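
As a rough illustration of the pinned-config, idempotent-reuse, and manifest behaviour listed above, here is a minimal sketch. It is not the contents of fetch_released_model.py: the config keys (`repo_id`, `revision`, `files`), the `manifest.json` name, the manifest fields, and the injectable `download` callable are all assumptions made for the example, as is the reading of "stale-image error" as "the image lacks huggingface_hub". The only library call used is `huggingface_hub.snapshot_download`.

```python
import json
import os
from pathlib import Path


def _hf_download(repo_id: str, revision: str, files: list, dest: Path) -> None:
    """Default downloader; imported lazily so a stale image fails with a clear message."""
    try:
        from huggingface_hub import snapshot_download
    except ImportError as exc:
        # Assumption: a stale image is one built before huggingface_hub was added.
        raise RuntimeError(
            "huggingface_hub is not installed; the container image looks stale, rebuild it"
        ) from exc
    snapshot_download(
        repo_id=repo_id,
        revision=revision,
        local_dir=str(dest),
        allow_patterns=list(files),
        token=os.environ.get("HF_TOKEN"),  # passed through from the host, may be None
    )


def fetch_released_model(config: dict, model_dir: Path, download=_hf_download) -> dict:
    """Download the pinned bundle into model_dir, or reuse a matching earlier download."""
    dest = Path(model_dir)
    manifest_path = dest / "manifest.json"  # hypothetical manifest file name
    files = list(config["files"])

    reused = False
    if manifest_path.is_file():
        previous = json.loads(manifest_path.read_text())
        # Reuse only when the pin matches and every bundle file is still present.
        reused = (
            previous.get("repo_id") == config["repo_id"]
            and previous.get("revision") == config["revision"]
            and all((dest / name).is_file() for name in files)
        )

    if not reused:
        dest.mkdir(parents=True, exist_ok=True)
        download(config["repo_id"], config["revision"], files, dest)
        missing = [name for name in files if not (dest / name).is_file()]
        if missing:
            raise FileNotFoundError(f"released-model bundle incomplete, missing: {missing}")

    manifest = {
        "repo_id": config["repo_id"],
        "revision": config["revision"],
        "path": str(dest),
        "files": files,
        "reused": reused,
    }
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest
```

Keying reuse on the recorded revision rather than on file presence alone means a bumped pin in released_model.json would trigger a fresh download instead of silently reusing an older bundle. Passing a fake `download` is also one way the always-run mocked unit tests could avoid network access.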

Signed-off-by: Eva Xue <evax@nvidia.com>
@evasnow1992 evasnow1992 requested a review from sveccham June 24, 2026 16:25

@sveccham sveccham left a comment

Collaborator

Looks good to me

2 participants