Merged
10 changes: 4 additions & 6 deletions README.md
@@ -165,10 +165,7 @@ implemented vs aspirational.
│ ├── requirements.txt
│ ├── README.md <- HF Spaces front-matter
│ └── tests/
-├── docs/ <- architecture notes + run journals
-│ ├── architecture.md
-│ ├── schema.md
-│ └── journal/
+├── docs/ <- architecture.md, schema.md, faq.md, journal/
└── .github/workflows/ <- ci-gate (test + lint + scan + dry-run-publish
+ schema-validate via paths-filter),
release-please, deploy-space
@@ -269,8 +266,9 @@ Optional v1 fields (non-breaking additions): `seed`, `gen_kwargs`,
`kernel`. The CLI auto-detects all of these at publish time —
no hand-curation required.

-See [`docs/schema.md`](docs/schema.md) for fields and
-[`docs/schema-migration.md`](docs/schema-migration.md) for version upgrades.
+See [`docs/schema.md`](docs/schema.md) for fields,
+[`docs/schema-migration.md`](docs/schema-migration.md) for version upgrades,
+and [`docs/faq.md`](docs/faq.md) for ops questions and troubleshooting.

### The publisher

93 changes: 93 additions & 0 deletions docs/faq.md
@@ -0,0 +1,93 @@
# FAQ & Troubleshooting

Common questions and operational issues when working with mlx-benchmarks.

## Envelope / Schema

### Q: Why do I get `additionalProperties: false` validation errors?

**A:** The envelope schema is strict: it rejects any field it does not define.
Check your envelope object for misspelled or extra field names. The authoritative
fields are in [`schema.json`](../schema.json) and documented in
[`docs/schema.md`](schema.md).
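
A quick way to spot the offending key is to diff the envelope's keys against the allowed names and suggest the nearest match. This is a minimal sketch using only the standard library; `ALLOWED` is a hypothetical subset for illustration (the real list lives in `schema.json`), and `unknown_fields` is not part of the project's CLI.

```python
import difflib

# Hypothetical subset of top-level field names; the real list lives in schema.json.
ALLOWED = {
    "schema_version", "timestamp", "git_sha", "trigger", "suite",
    "model", "system", "results", "model_revision", "quantization",
}


def unknown_fields(envelope, allowed=ALLOWED):
    """Map each unknown top-level key to its closest allowed name, if any."""
    return {
        key: difflib.get_close_matches(key, sorted(allowed), n=1)
        for key in envelope
        if key not in allowed
    }


# A typo in "model_revision" is reported together with the likely intended name.
envelope = {"schema_version": 1, "model_revison": "abc123"}
print(unknown_fields(envelope))
```

An empty result means every top-level key is known; nested objects such as `system` would need the same check against their own allowed names.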

### Q: What fields are required vs. optional in v1?

**A:** Required: `schema_version`, `timestamp`, `git_sha`, `trigger`, `suite`,
`model`, `system` (with `os`, `chip`, `memory_gb`), and `results` (non-empty array).

Optional: `pr_number`, `model_revision`, `quantization`, `seed`, `gen_kwargs`, `errors`, etc.
The publisher auto-populates optional fields like `python_version` and `mlx_version`
via `detect_system()`. See [`docs/schema.md`](schema.md) for the full table.
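
The required list above can be checked before publishing. The sketch below hard-codes the field names from this answer and is an illustration only; the helper name `missing_required` is hypothetical, and the schema itself remains the authority.

```python
# Required field names as listed in this FAQ answer.
REQUIRED = (
    "schema_version", "timestamp", "git_sha", "trigger",
    "suite", "model", "system", "results",
)
REQUIRED_SYSTEM = ("os", "chip", "memory_gb")


def missing_required(envelope):
    """Return human-readable names of required fields that are absent or empty."""
    missing = [name for name in REQUIRED if name not in envelope]

    # `system` must carry its own required keys.
    system = envelope.get("system")
    if isinstance(system, dict):
        missing += [f"system.{name}" for name in REQUIRED_SYSTEM if name not in system]

    # `results` must be present and a non-empty array.
    if "results" in envelope:
        results = envelope["results"]
        if not (isinstance(results, list) and results):
            missing.append("results (must be a non-empty array)")
    return missing
```

An empty list means the envelope satisfies the required-field rules described here; type and format checks are still left to schema validation.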

### Q: When do I bump `schema_version`?

**A:** Only on breaking changes: removing/renaming a field, changing a type, or
tightening validation. Adding optional fields does not require a bump. See
[`docs/schema.md`](schema.md) for the versioning policy.
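
As a rough illustration of that policy, the sketch below compares two JSON Schema dictionaries and flags the breaking cases named above. It is an assumption-laden simplification: it only looks at top-level `properties`, `type`, and `required`, and it treats a newly required field as the only form of tightened validation. The function name `is_breaking` is hypothetical.

```python
def is_breaking(old_schema, new_schema):
    """Return True if moving from old_schema to new_schema needs a version bump."""
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})

    # A removed (or renamed) field breaks existing envelopes.
    if set(old_props) - set(new_props):
        return True

    # A changed type breaks existing envelopes.
    if any(old_props[k].get("type") != new_props[k].get("type") for k in old_props):
        return True

    # Tightened validation: a field that was optional is now required.
    old_required = set(old_schema.get("required", []))
    new_required = set(new_schema.get("required", []))
    return bool(new_required - old_required)
```

Adding an optional property leaves all three checks false, which matches the rule that optional additions do not require a bump.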

## Inference / Performance

### Q: My benchmark runs very slowly. Is that expected?

**A:** Slow runs usually indicate memory pressure, suboptimal config, or a process
leak. Diagnostic order:

1. **Memory pressure:** Compare `footprint -p $(pgrep -f "vllm-mlx")` to the
expected footprint for your model size.
2. **Config:** Check `--cache-memory-mb`, `--enable-prefix-cache`, and quantization
settings against your model.
3. **Accumulated allocations:** See [`docs/quick-reset.md`](quick-reset.md) to
reclaim memory without rebooting.

### Q: I get 404 or connection errors when running lm-eval. What is wrong?

**A:** The lm-eval configs expect `http://localhost:11434/v1/chat/completions`
(note the `/chat/completions` suffix). A common mistake is omitting that suffix.

Verify the endpoint is reachable: `curl http://localhost:11434/v1/models`
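
If the URL is assembled in a script, the suffix can be enforced programmatically. The helper below is a hypothetical sketch using only the standard library, not something shipped with the lm-eval configs.

```python
from urllib.parse import urlsplit, urlunsplit

SUFFIX = "/chat/completions"


def chat_completions_url(base_url):
    """Return base_url with the /chat/completions suffix, adding it if missing."""
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/")
    if not path.endswith(SUFFIX):
        path += SUFFIX
    return urlunsplit(parts._replace(path=path))


print(chat_completions_url("http://localhost:11434/v1"))
```

A URL that already ends in `/chat/completions` is returned unchanged, so the helper is safe to apply more than once.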

## Publishing

### Q: I published results and want to re-publish. What happens?

**A:** Filenames are deterministic, built from envelope fields:
`run-<timestamp>-<git_sha>-<suite>-<model_slug>.parquet`.
Re-publishing the same envelope overwrites the existing shard, so no duplicates are created.
To correct a wrong model name, fix the envelope and re-publish; because the model slug
is part of the filename, the corrected run lands under a different filename.
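
The naming rule can be pictured with the sketch below. Only the filename pattern comes from this FAQ; the slug rule in `slugify` and the assumption that `timestamp` is inserted verbatim are hypothetical, so the publisher's actual output may differ.

```python
import re


def slugify(text):
    """Hypothetical slug rule: lowercase, runs of non-alphanumerics become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def shard_name(envelope):
    """Build run-<timestamp>-<git_sha>-<suite>-<model_slug>.parquet from an envelope."""
    return "run-{timestamp}-{git_sha}-{suite}-{model_slug}.parquet".format(
        timestamp=envelope["timestamp"],  # assumed to be used verbatim
        git_sha=envelope["git_sha"],
        suite=envelope["suite"],
        model_slug=slugify(envelope["model"]),
    )
```

Because the name depends only on these four fields, the same envelope always maps to the same shard, while changing `model` yields a new name.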

### Q: HF upload fails with 401. What token scope do I need?

**A:** Your HuggingFace token must have **write** scope on the target dataset.
Generate one at <https://huggingface.co/settings/tokens> with "Write" permission,
then export it before running the publisher:

```bash
export HF_TOKEN=<your-hf-write-token>
```

### Q: I published successfully but the viewer does not show my result. Why?

**A:** The Space viewer caches the dataset. Wait a few minutes, then refresh.
Confirm the shard landed by checking the HF dataset repo directly.

## CI

### Q: Pip-audit or OSV scanner flags a dependency I want to suppress. How?

**A:** Add an ignore rule to [`osv-scanner.toml`](../osv-scanner.toml):

```toml
[[IgnoredVulns]]
id = "GHSA-xxxx-xxxx-xxxx"
ignoreUntil = 2026-12-31
reason = "<justification>"
```

`ignoreUntil` (an unquoted TOML date literal) forces the entry back through
review when the date passes — see [`osv-scanner.toml`](../osv-scanner.toml)
for live examples.

---

Still stuck? Check [`CLAUDE.md`](../CLAUDE.md) for architecture notes,
[`CONTRIBUTING.md`](../CONTRIBUTING.md) for the dev workflow, or
[open an issue](https://github.com/JacobPEvans/mlx-benchmarks/issues).