docs(rust): sync async-openai how-to with the loud-failure example by amavashev · Pull Request #661 · runcycles/cycles-docs

amavashev · 2026-05-16T16:53:38Z

Why

When the cycles-client-rust example (PR #37, merged) went through codex code review, codex caught silent-under-billing bugs in patterns like `response.usage.map(...).unwrap_or(0)`. Those fixes landed in the example. They didn't land in the cycles-docs how-to — PR #659 merged the doc with the original buggy patterns intact.

The result: the shipped doc on `main` and the shipped example in cycles-client-rust were teaching opposite things on the same edge cases.

Edge case	Doc (on main, pre-PR)	Example (on main, post-PR-#37)
Missing `response.usage`	`.unwrap_or(0)` → commit zero	`.ok_or(...)?` → release + error
Missing `message.content`	`.unwrap_or_default()` → empty string success	`.ok_or(...)?` → release + error
Non-positive `caps.max_tokens`	`(cap as u32)` wraps; `max_completion_tokens=0` sent	`u32::try_from(...)?` + zero-check + error
OpenAI field	`.max_tokens(...)` (deprecated upstream)	`.max_completion_tokens(...)`
Token conversion	`as i64`	`i64::from(...)` / `i64::try_from(...)`

Plus an internal contradiction in the doc itself: gotcha #2 told readers "Treat `None` as estimate locally rather than no spend" while the code blocks unwrapped to zero — which is no spend. Description and code disagreed.

What's in this PR

Brings the doc back in sync with the example and fixes the internal contradictions.

Code changes across all four code blocks (basic, ALLOW_WITH_CAPS, streaming, error-aware)

`.max_tokens(...)` → `.max_completion_tokens(...)`
Missing `response.usage`: `.unwrap_or(0)` → `.ok_or(...)?` in non-streaming examples. Streaming keeps the option of a tokenizer fallback (legitimate exception — re-issuing a consumed stream is expensive), but the default code path now releases + errors; the tokenizer fallback is shown in a dedicated subsection.
Missing `message.content`: `.unwrap_or_default()` → `.ok_or(...)?` everywhere
`(cap as u32)` → `u32::try_from(cap)?` with explicit zero-check
`usage.total_tokens as i64` → `i64::from(usage.total_tokens)`
`microcents as i64` → `i64::try_from(microcents)?`
`ChatCompletionStreamOptions`: fixed to use `Option` field shape that async-openai 0.38.x actually defines (`include_usage: Some(true), include_obfuscation: None`)
Streaming end-of-stream: added an empty-`full_text` release-and-bail check before the usage check, so the loud-failure stance applies to the "no content but has usage" edge case too
Streaming zero-cap: `guard.release(...).await?` → `let _ = guard.release(...).await;` so the typed zero-cap error wins over a release-side error

Tokenizer fallback (streaming only)

The fictional `estimate_tokens_with_tiktoken(...)` call became a real, copy-pasteable helper in a new "Optional: tokenizer fallback for missing-usage chunks" subsection. Uses `tiktoken_rs::o200k_base()` with `i64::try_from(...)` conversions (matches gotcha #6).

Prose changes

Added a "Loud-failure stance" callout in the intro that explicitly references the shipped example
Rewrote gotchas Standardize Cycles scope terminology and clarify protocol behavior #2 and Add comprehensive documentation for Cycles protocol and client #3 to match the new code (no longer contradict it):
- Standardize Cycles scope terminology and clarify protocol behavior #2 now distinguishes non-streaming (loud failure) from streaming (tokenizer fallback)
- Add comprehensive documentation for Cycles protocol and client #3 says "fail loud and release" instead of "commit zero or commit the prompt-token cost only"
Added gotcha Add comprehensive comparison guide for Cycles vs alternatives #6 covering `as u32` / `as i64` on values from external sources
Cargo.toml block: added `thiserror = "2"` (was referenced by the `#[derive(thiserror::Error)]` in the error-aware example but missing from the deps list)
"Other Rust LLM clients" adaptation guidance updated to teach the `ok_or(...)?` + `i64::from(...)` pattern for non-OpenAI providers too

Review process

Codex round 1 → REVISE-MINOR (6 findings, all applied):
- `ChatCompletionStreamOptions` field shape against actual async-openai 0.38.2 surface
- Missing `thiserror` in Cargo.toml block
- Streaming `release(...).await?` vs `let _ = ... .await` swallow order
- End-of-stream empty-content hole
- Stale `response.usage.map(...).unwrap_or(0)` prose still in "Other Rust LLM clients"
- Fictional `estimate_tokens(...)` call — made real
- Also caught: tiktoken stub itself used `as i64` casts (the same pattern gotcha Add comprehensive comparison guide for Cycles vs alternatives #6 warns against); switched to `i64::try_from(...)`
Codex round 2 → SHIP

Verified against runcycles 0.2.4 source, async-openai 0.38.2 docs.rs, tiktoken-rs latest docs.rs, and the shipped `cycles-client-rust/examples/async_openai_completion.rs` on main.

Test plan

`npm run build` succeeds; no broken-link warnings
Rendered page shows the new "Loud-failure stance" callout under "What you get"
All four code blocks use `.max_completion_tokens(...)` and `ok_or(...)?` patterns
Streaming section's "Optional: tokenizer fallback" subsection renders correctly
Gotchas Standardize Cycles scope terminology and clarify protocol behavior #2 and Add comprehensive documentation for Cycles protocol and client #3 read consistently with the new code (no longer contradict)

When the cycles-client-rust example (PR #37) went through codex code review, codex caught silent-under-billing bugs in patterns like `response.usage.map(...).unwrap_or(0)` — we fixed those in the example but never backported the fixes to the cycles-docs how-to. PR #659 merged the doc with the buggy patterns intact. As a result, the shipped doc (on main) and the shipped example (in cycles-client-rust) were teaching opposite things: - Doc: `.unwrap_or(0)` on missing usage → silent commit zero - Example: `.ok_or(...)?` on missing usage → error, auto-release Plus an internal contradiction inside the doc itself: the "Common gotchas" section said "Treat None as estimate locally rather than no spend" while the code blocks above unwrap to zero. Description vs code disagreed. This PR brings the doc back in sync with the example and fixes the internal contradictions. Changes in `how-to/integrating-cycles-with-async-openai.md`: - `.max_tokens(...)` → `.max_completion_tokens(...)` in all four code blocks (basic, ALLOW_WITH_CAPS, streaming, error-aware). OpenAI deprecated `max_tokens` for chat completions in favor of `max_completion_tokens`; async-openai 0.38 supports both but the doc should teach the current name. - Missing `response.usage`: `.unwrap_or(0)` → `.ok_or(...)?` in non-streaming examples (basic, ALLOW_WITH_CAPS, error-aware, token-to-USD). Streaming kept the fallback pattern because re-issuing a consumed stream is expensive — that's the one legitimate place a tokenizer estimate beats erroring out. - Missing `response.choices[0].message.content`: `.unwrap_or_default()` → `.ok_or(...)?` everywhere. Committing on an empty reply is the silent under-billing pattern that codex caught in the example. - `(cap as u32)` → `u32::try_from(cap)?` with explicit zero-check. The old pattern would silently wrap on a negative `caps.max_tokens` and would send `max_completion_tokens=0` (rejected by OpenAI, but only after we've already paid the request cost) on a zero cap. - `usage.total_tokens as i64` → `i64::from(usage.total_tokens)` — no `as` cast risk, more idiomatic. - `microcents as i64` → `i64::try_from(microcents)?` — same. - Streaming example's `estimate_tokens_with_tiktoken(...)` was a fictional function reference (never defined or imported). Replaced with a real inline tokenizer stub showing how to plug in `tiktoken-rs`, with a clear comment that the alternative is releasing the guard and erroring. - Added a "Loud-failure stance" callout under "What you get" that spells out the design choice explicitly and points readers at the shipped `examples/async_openai_completion.rs` for the reference implementation. - Rewrote gotchas #2 and #3 to match the new code rather than contradict it: gotcha #2 now distinguishes non-streaming (loud failure) from streaming (tokenizer fallback); gotcha #3 says "fail loud and release" instead of "commit zero". Added gotcha #6 covering `as` casts on values from external sources. The async-openai 0.38 API and the runcycles 0.2.4 API are unchanged from the prior version-sync PR (#660); this PR only changes how the example code handles edge cases.

Apply/skip tally: 6 applied, 0 pushed back. Applied: - `ChatCompletionStreamOptions` fields are `Option<bool>` in async-openai 0.38.x, not raw `bool`. Updated to `{ include_usage: Some(true), include_obfuscation: None }` with a comment explaining the `None` default. - Cargo.toml block was missing `thiserror = "2"` even though the error-aware example uses `#[derive(thiserror::Error)]`. Added. - Streaming zero-cap path used `guard.release(...).await?` — if release itself errored, the caller would see the release error instead of the original zero-cap error. Switched to `let _ = guard.release(...).await;` so the typed zero-cap error wins. - Streaming end-of-stream had an "empty content, but has usage" hole — the example would commit on a stream that produced no text. Added an empty-`full_text` release-and-bail check before the usage check, so the loud-failure stance applies consistently. - "Other Rust LLM clients" section still showed the old `response.usage.map(|u| u.total_tokens as i64)` pattern in the adaptation guidance for non-OpenAI providers. Updated to the `ok_or(...)?` + `i64::from(...)` shape that matches the rest of the doc. - The `estimate_tokens()` function was referenced in the streaming example but only existed as a comment. Made it a real, copy-pasteable helper (uses `tiktoken-rs::o200k_base`) and moved the fallback into an explicit "Optional: tokenizer fallback for missing-usage chunks" subsection, framed as an alternative to the loud-failure default. Also caught during the same pass: - The tiktoken-rs stub itself used `as i64` casts on `usize` values — exactly the pattern the new gotcha #6 warns against. Switched to `i64::try_from(...)?` and changed the function signature to return `Result<i64, Box<dyn Error>>` so the conversion error has somewhere to go. Codex verified the runcycles + async-openai 0.38.2 API surface, the tiktoken-rs API names, and confirmed the shipped cycles-client-rust/examples/async_openai_completion.rs on main matches the doc's loud-failure stance.

Two findings from the fresh codex pass against main after PR #661 merged plus one follow-up after round 2: - `tiktoken-rs` version pin in the streaming tokenizer-fallback snippet was `"0.6"` while crates.io is at 0.11.0. The cited API (`o200k_base`, `encode_with_special_tokens`) is unchanged across that range, but the pin should match current. Bumped to `"0.11"`. - The error-aware `run_completion` example skipped cap handling entirely — it called `.max_completion_tokens(800u32)` without consulting `guard.caps()`. The intro's loud-failure stance claimed all examples on the page error out on non-positive `caps.max_tokens`, which was false for that example. Added the same `u32::try_from(cap)` + zero-check + release-and-error block used in the ALLOW_WITH_CAPS and streaming examples, with the error variants typed as `CompletionError::Cycles(CyclesError::Validation(...))` to flow through the function's typed error contract. - Codex round 2 then caught that the BASIC example also doesn't read caps — it takes `_ctx` to keep the minimum-viable composition compact. Rather than add cap handling and dilute that example's purpose, narrowed the intro claim: all four examples fail loud on missing `usage` and missing `content`, and the three cap-reading examples additionally fail on non-positive caps. The basic example explicitly ignores caps; production code should follow the capped pattern. Codex round 3 returned SHIP. Sources verified: runcycles 0.2.4 source, the shipped async_openai_completion.rs example on main, async-openai 0.38.2 docs.rs, tiktoken-rs 0.11.0 docs.rs.

amavashev added 2 commits May 16, 2026 12:41

amavashev merged commit 779c022 into main May 16, 2026
5 checks passed

amavashev deleted the docs/rust-async-openai-loud-failure-consistency branch May 16, 2026 23:49

amavashev restored the docs/rust-async-openai-loud-failure-consistency branch May 16, 2026 23:55

amavashev mentioned this pull request May 17, 2026

docs(rust): final-state cleanup — tiktoken pin + error-aware cap handling #667

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs(rust): sync async-openai how-to with the loud-failure example#661

docs(rust): sync async-openai how-to with the loud-failure example#661
amavashev merged 2 commits into
mainfrom
docs/rust-async-openai-loud-failure-consistency

amavashev commented May 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

amavashev commented May 16, 2026

Why

What's in this PR

Code changes across all four code blocks (basic, ALLOW_WITH_CAPS, streaming, error-aware)

Tokenizer fallback (streaming only)

Prose changes

Review process

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant