Skip to content

Add W&B Inference prefix caching docs#2372

Open
corbt wants to merge 2 commits intomainfrom
codex/prefix-caching-docs
Open

Add W&B Inference prefix caching docs#2372
corbt wants to merge 2 commits intomainfrom
codex/prefix-caching-docs

Conversation

@corbt
Copy link
Copy Markdown

@corbt corbt commented Mar 26, 2026

Summary

Add a new W&B Inference docs page for prefix caching and cache isolation, and add it to the English Inference "Response Settings" nav.

Motivation

Motivating Slack thread:
https://weightsandbiases.slack.com/archives/C08RU04P36G/p1774553788470369

A customer is worried about security on a multi-tenant installation, and we want to show that we take that concern seriously. This page explains how prefix caching works at a high level and how cache_salt can be used to isolate cache reuse across trust boundaries.

Notes

  • This keeps the existing chat-completions page simple instead of documenting one advanced parameter there while more basic request parameters are still undocumented.
  • I limited this change to English content plus the English nav entry. I did not edit JA or KO content.
  • I verified the behavior live against https://api.inference.wandb.ai/v1/chat/completions on moonshotai/Kimi-K2.5:
    • same prompt + same cache_salt reused almost the entire prompt cache
    • same prompt + different cache_salt forced a cold miss
    • empty cache_salt returned a 400

@corbt corbt requested a review from a team as a code owner March 26, 2026 21:16
@mintlify
Copy link
Copy Markdown
Contributor

mintlify bot commented Mar 26, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
wandb 🟢 Ready View Preview Mar 26, 2026, 9:20 PM

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 26, 2026

📚 Mintlify Preview Links

🔗 View Full Preview

✨ Added (1 total)

📄 Pages (1)

File Preview
inference/response-settings/prefix-caching.mdx Prefix Caching

📝 Changed (1 total)

⚙️ Other (1)
File
docs.json

🤖 Generated automatically when Mintlify deployment succeeds
📍 Deployment: 31cbde1 at 2026-04-01 20:30:08 UTC

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 26, 2026

🔗 Link Checker Results

All links are valid!

No broken links were detected.

Checked against: https://wb-21fd5541-codex-prefix-caching-docs.mintlify.app

@mdlinville
Copy link
Copy Markdown
Contributor

Tagging @jamie-rasmussen for technical review to start with, since he was involved in the Slack thread. 🙏

"content": "Summarize this document in one sentence: <long shared prefix here>"
},
],
cache_salt="tenant-a-user-123-secret",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corbt I don't think this works - gives you TypeError: Completions.create() got an unexpected keyword argument 'cache_salt'

I think you can replace with something like:

    extra_body={
        "cache_salt": "tenant-a-user-123-secret",
    },

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants