A Rust library and CLI for downloading and inspecting HuggingFace models. Multi-connection parallel downloads, file filtering, checksum verification, retry — plus safetensors header inspection and tensor layout comparison between models, all without downloading weight data.
- Install
- Commands
- Try it
- Inspect & compare
- Disk usage
- Library quick start
- Documentation
- Used by
- License
- Development
cargo install hf-fetch-model --features cli

| Command | Description |
|---|---|
| `hf-fm <REPO_ID>` (default) | Download a model (multi-connection, auto-tuned) |
| `hf-fm cache clean-partial` | Remove `.chunked.part` files from interrupted downloads |
| `hf-fm cache delete <REPO_ID\|N>` | Delete a cached model |
| `hf-fm cache path <REPO_ID\|N>` | Print snapshot directory path (for scripting) |
| `hf-fm diff <REPO_A> <REPO_B>` | Compare tensor layouts between two models |
| `hf-fm discover` | Find new model families on the Hub |
| `hf-fm download-file <REPO_ID> <FILE>` | Download a single file (or glob pattern) |
| `hf-fm du [REPO_ID\|N]` | Show cache disk usage (by name or # index) |
| `hf-fm inspect <REPO_ID> [FILE]` | Inspect safetensors headers (tensor names, shapes, dtypes) without downloading weights |
| `hf-fm list-families` | List model families in local cache |
| `hf-fm list-files <REPO_ID>` | List remote files (sizes, SHA256) without downloading |
| `hf-fm search <QUERY>` | Search the HuggingFace Hub for models |
| `hf-fm status [REPO_ID]` | Show download status (complete / partial / missing) |
See CLI Reference for all flags and output examples.
$ hf-fm search mistral,3B,instruct
Models matching "mistral,3B,instruct" (by downloads):
hf-fm mistralai/Ministral-3-3B-Instruct-2512 (159,700 downloads)
hf-fm mistralai/Ministral-3-3B-Instruct-2512-BF16 (62,600 downloads)
hf-fm mistralai/Ministral-3-3B-Instruct-2512-GGUF (32,700 downloads)
...
$ hf-fm search llama --tag gguf --limit 3
Models matching "llama" (by downloads):
hf-fm bartowski/Llama-3.2-3B-Instruct-GGUF (489,856 downloads) [text-generation]
hf-fm bartowski/Meta-Llama-3.1-8B-Instruct-GGUF (237,791 downloads) [text-generation]
hf-fm MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF (184,847 downloads) [text-generation]
$ hf-fm search mistralai/Ministral-3-3B-Instruct-2512 --exact
Exact match:
hf-fm mistralai/Ministral-3-3B-Instruct-2512 (159,700 downloads)
License: apache-2.0
Pipeline: text-generation
Library: vllm
Languages: en, fr, es, de, it, pt, nl, zh, ja, ko, ar
$ hf-fm list-files mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
File Size SHA256
model-00001-of-00002.safetensors 3.68 GiB a1b2c3d4e5f6
model-00002-of-00002.safetensors 2.88 GiB f6e5d4c3b2a1
config.json 856 B —
...
7 files, 6.57 GiB total
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --dry-run
Repo: mistralai/Ministral-3-3B-Instruct-2512
Revision: main
File Size Status
model-00001-of-00002.safetensors 3.68 GiB to download
model-00002-of-00002.safetensors 2.88 GiB to download
...
Total: 6.57 GiB (7 files, 0 cached, 7 to download)
Recommended config:
concurrency: 2
connections/file: 8
chunk threshold: 100 MiB
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors
Downloaded to: ~/.cache/huggingface/hub/models--mistralai--Ministral-3-3B.../snapshots/...
6.57 GiB in 18.2s (369.1 MiB/s)
# Download to flat layout (files directly in ./models/)
$ hf-fm mistralai/Ministral-3-3B-Instruct-2512 --preset safetensors --flat --output-dir ./models
# Download sharded PyTorch files by glob
$ hf-fm download-file org/model "pytorch_model-*.bin"
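The dry-run output above recommends a number of connections per file and a chunk threshold. As a rough illustration of what multi-connection downloading involves, the sketch below splits a file into inclusive byte ranges that could each be fetched with its own HTTP `Range` request. The function name `split_ranges` and the even-split policy are assumptions made for illustration; the README does not describe the chunking strategy hf-fetch-model actually uses.

```rust
/// Split a file of `total` bytes into at most `connections` inclusive byte
/// ranges, each usable in a `Range: bytes=start-end` header.
/// Files smaller than `threshold` are kept as a single range.
/// Hypothetical helper: not part of the hf-fetch-model API.
fn split_ranges(total: u64, connections: u64, threshold: u64) -> Vec<(u64, u64)> {
    if total == 0 {
        return Vec::new();
    }
    // Never create more parts than there are bytes, and always at least one.
    let parts = if total < threshold { 1 } else { connections.clamp(1, total) };
    let base = total / parts;
    let extra = total % parts; // the first `extra` ranges carry one more byte

    let mut ranges = Vec::with_capacity(parts as usize);
    let mut start = 0u64;
    for i in 0..parts {
        let len = base + u64::from(i < extra);
        ranges.push((start, start + len - 1));
        start += len;
    }
    ranges
}

fn main() {
    const MIB: u64 = 1024 * 1024;
    // A 1 GiB file, 8 connections, 100 MiB chunk threshold.
    for (start, end) in split_ranges(1024 * MIB, 8, 100 * MIB) {
        println!("Range: bytes={start}-{end}");
    }
}
```

The ranges are contiguous and cover the file exactly once, so the parts can be written to their offsets independently and in any order.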
$ hf-fm inspect EleutherAI/pythia-1.4b model.safetensors --cached --filter "layers.0."
Repo: EleutherAI/pythia-1.4b
File: model.safetensors
Source: cached
Tensor Dtype Shape Size Params
gpt_neox.layers.0.attention.dense.weight F16 [2048, 2048] 8.00 MiB 4.2M
gpt_neox.layers.0.mlp.dense_h_to_4h.weight F16 [8192, 2048] 32.00 MiB 16.8M
...
────────────────────────────────────────────────────────────────────────────────────────────────
15/364 tensors, 54.6M/1.52B params (filter: "layers.0.")
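The Size and Params columns above follow directly from each tensor's dtype and shape. The sketch below reproduces that arithmetic for the first row; the helper names are hypothetical, and only a subset of safetensors dtype tags is handled.

```rust
/// Bytes per element for common safetensors dtype tags.
/// Returns `None` for tags this sketch does not know about.
fn dtype_bytes(dtype: &str) -> Option<u64> {
    match dtype {
        "F64" | "I64" | "U64" => Some(8),
        "F32" | "I32" | "U32" => Some(4),
        "F16" | "BF16" | "I16" | "U16" => Some(2),
        "I8" | "U8" | "BOOL" | "F8_E4M3" | "F8_E5M2" => Some(1),
        _ => None,
    }
}

/// Number of elements: the product of all dimensions (1 for a scalar).
fn param_count(shape: &[u64]) -> u64 {
    shape.iter().product()
}

/// Storage size in bytes, or `None` if the dtype is unknown.
fn size_bytes(dtype: &str, shape: &[u64]) -> Option<u64> {
    Some(param_count(shape) * dtype_bytes(dtype)?)
}

fn main() {
    // gpt_neox.layers.0.attention.dense.weight: F16 [2048, 2048]
    let shape = [2048, 2048];
    let params = param_count(&shape);
    let bytes = size_bytes("F16", &shape).expect("known dtype");
    // prints "4194304 params, 8.00 MiB"
    println!("{params} params, {:.2} MiB", bytes as f64 / (1024.0 * 1024.0));
}
```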
$ hf-fm inspect google/gemma-4-E2B-it model.safetensors --tree --filter "embed"
Repo: google/gemma-4-E2B-it
File: model.safetensors
Source: remote (2 HTTP requests)
└── model.
├── embed_audio.embedding_projection.weight BF16 [1536, 1536] 4.50 MiB
├── embed_vision.embedding_projection.weight BF16 [1536, 768] 2.25 MiB
├── language_model.
│ ├── embed_tokens.weight BF16 [262144, 1536] 768.00 MiB
│ └── embed_tokens_per_layer.weight BF16 [262144, 8960] 4.38 GiB
└── vision_tower.patch_embedder.
├── input_proj.weight BF16 [768, 768] 1.12 MiB
└── position_embedding_table BF16 [2, 10240, 768] 30.00 MiB
6/2011 tensors, 2.77B/5.12B params (filter: "embed")
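The "remote (2 HTTP requests)" source above fits the safetensors file layout: the first 8 bytes hold a little-endian `u64` giving the length of the JSON header, and the header itself follows immediately. A minimal sketch of the two byte ranges involved is shown below. That hf-fetch-model issues exactly these two requests is an assumption, and the helper names are hypothetical.

```rust
/// Decode the header length from the first 8 bytes of a safetensors file
/// (the bytes a `Range: bytes=0-7` request would return).
fn header_len(prefix: [u8; 8]) -> u64 {
    u64::from_le_bytes(prefix)
}

/// Inclusive byte range of the JSON header, for a second Range request.
/// Returns `None` for an empty header or a length that would overflow.
fn header_range(len: u64) -> Option<(u64, u64)> {
    if len == 0 {
        return None;
    }
    // The header starts at offset 8 and ends at offset 8 + len - 1.
    let end = 7u64.checked_add(len)?;
    Some((8, end))
}

fn main() {
    // Stand-in for the response to the first request; a real client would
    // read these 8 bytes from the network.
    let prefix = 1234u64.to_le_bytes();
    let len = header_len(prefix);
    if let Some((start, end)) = header_range(len) {
        println!("Range: bytes={start}-{end}");
    }
}
```

Because tensor names, dtypes, shapes, and data offsets all live in that JSON header, the weight bytes after it never need to be fetched.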
$ hf-fm diff RedHatAI/Llama-3.2-1B-Instruct-FP8 casperhansen/llama-3.2-1b-instruct-awq --cached --summary
A: RedHatAI/Llama-3.2-1B-Instruct-FP8
B: casperhansen/llama-3.2-1b-instruct-awq
──────────────────────────────────────────────────────────────────────────────────────────────
A: 371 tensors | B: 370 tensors | only-A: 337 | only-B: 336 | differ: 34 | match: 0
Inspect reads tensor metadata via HTTP Range requests (2 requests per file) — no weight data downloaded. The --tree flag shows the hierarchical namespace with numeric sibling groups auto-collapsed to [0..N] for structural discovery. Diff compares tensor names, dtypes, and shapes between any two models (remote or cached).
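The summary line in the diff output can be read as a four-way classification of tensor names. The sketch below shows one way to compute such counts from two name-to-(dtype, shape) maps. The `Layout` and `Summary` types are hypothetical and are not the library's API; the exact comparison rules used by `hf-fm diff` may differ.

```rust
use std::collections::BTreeMap;

/// Tensor name -> (dtype tag, shape). Hypothetical type for illustration.
type Layout = BTreeMap<String, (String, Vec<u64>)>;

#[derive(Debug, Default, PartialEq)]
struct Summary {
    only_a: usize,
    only_b: usize,
    differ: usize,
    matching: usize,
}

/// Classify every tensor name as only-A, only-B, differing, or matching.
fn summarize(a: &Layout, b: &Layout) -> Summary {
    let mut s = Summary::default();
    for (name, meta_a) in a {
        match b.get(name) {
            None => s.only_a += 1,
            Some(meta_b) if meta_a == meta_b => s.matching += 1,
            Some(_) => s.differ += 1, // same name, different dtype or shape
        }
    }
    s.only_b = b.keys().filter(|k| !a.contains_key(*k)).count();
    s
}

/// Small helper to build a map entry from borrowed data.
fn entry(name: &str, dtype: &str, shape: &[u64]) -> (String, (String, Vec<u64>)) {
    (name.to_owned(), (dtype.to_owned(), shape.to_vec()))
}

fn main() {
    // Made-up tensors, not taken from any real model.
    let a: Layout = [entry("w", "F16", &[4, 4]), entry("scale", "F32", &[4])]
        .into_iter()
        .collect();
    let b: Layout = [entry("w", "I32", &[4, 1]), entry("qzeros", "I32", &[1])]
        .into_iter()
        .collect();
    println!("{:?}", summarize(&a, &b));
}
```

Under this classification, only-A + differ + match equals the tensor count of A, and likewise for B, which agrees with the figures in the example output (337 + 34 + 0 = 371 and 336 + 34 + 0 = 370).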
$ hf-fm du
# SIZE REPO FILES
1 5.10 GiB google/gemma-2-2b-it 8
2 2.80 GiB EleutherAI/pythia-1.4b 12 ●
3 1.20 GiB google/gemma-scope-2b-pt-res 3
─────────────────────────────────────────────────────────────────────────────
9.10 GiB total (3 repos, 23 files)
● = partial downloads
$ hf-fm du 2
EleutherAI/pythia-1.4b:
# SIZE FILE
1 2.50 GiB model-00001-of-00002.safetensors
2 0.26 GiB model-00002-of-00002.safetensors
...
──────────────────────────────────────────────────────────────────
2.80 GiB total (12 files)
● partial downloads — run `hf-fm status EleutherAI/pythia-1.4b` for details
$ hf-fm du --age
# SIZE REPO FILES AGE
1 5.10 GiB google/gemma-2-2b-it 8 2 days ago
2 2.80 GiB EleutherAI/pythia-1.4b 12 45 days ago ●
3 1.20 GiB google/gemma-scope-2b-pt-res 3 3 months ago
─────────────────────────────────────────────────────────────────────────────────────────
9.10 GiB total (3 repos, 23 files)
● = partial downloads
$ hf-fm cache path google/gemma-2-2b-it
/home/user/.cache/huggingface/hub/models--google--gemma-2-2b-it/snapshots/abc1234
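The path printed above follows the HuggingFace hub cache layout, where a model repo `org/name` is stored under `models--org--name/snapshots/<revision>`. For scripts that need to build such a path themselves, a minimal sketch follows. The helper names are hypothetical, and `abc1234` is the placeholder revision from the example output, not a real commit hash.

```rust
use std::path::{Path, PathBuf};

/// Cache folder name for a model repo: `/` in the repo ID becomes `--`.
fn repo_folder_name(repo_id: &str) -> String {
    format!("models--{}", repo_id.replace('/', "--"))
}

/// Snapshot directory for a given revision under a hub cache root.
fn snapshot_dir(hub_root: &Path, repo_id: &str, revision: &str) -> PathBuf {
    hub_root
        .join(repo_folder_name(repo_id))
        .join("snapshots")
        .join(revision)
}

fn main() {
    let root = Path::new("/home/user/.cache/huggingface/hub");
    let dir = snapshot_dir(root, "google/gemma-2-2b-it", "abc1234");
    println!("{}", dir.display());
}
```

In practice `hf-fm cache path` is the more robust choice, since it resolves the actual snapshot on disk rather than assuming a revision.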
let outcome = hf_fetch_model::download(
"google/gemma-2-2b-it".to_owned(),
).await?;
println!("Model at: {}", outcome.inner().display());

Filtering, progress reporting, auth, and more are available via the builder; see Configuration.
| Topic | Description |
|---|---|
| CLI Reference | All subcommands, flags, and output examples |
| FAQ | Common questions — installation, auth, cache location, discovery, errors |
| Search | Comma filtering, --exact, model card metadata |
| Configuration | Builder API, presets, progress callbacks |
| Architecture | How hf-fetch-model relates to hf-hub and candle-mi |
| Diagnostics | --verbose output, tracing setup for library users |
| Upstream differences | Where hf-fetch-model diverges from Python huggingface_hub/hf_transfer |
| Candle example | Inspect tensor layouts before downloading — for candle users |
| Changelog | Release history and migration notes |
- candle-mi — Mechanistic interpretability toolkit for language models
Licensed under either of Apache License, Version 2.0 or MIT License at your option.
- Exclusively developed with Claude Code (dev) and Augment Code (review)
- Git workflow managed with Fork
- All code follows CONVENTIONS.md, derived from Amphigraphic-Strict's Grit — a strict Rust subset designed to improve AI coding accuracy.