examples: add async_openai_completion + point README at it by amavashev · Pull Request #37 · runcycles/cycles-client-rust

amavashev · 2026-05-16T15:32:50Z

Why

The existing five examples (`basic_usage`, `error_handling`, `guard_usage`, `streaming_usage`, `with_cycles_usage`) all use an inline `call_llm` stub. That keeps the runcycles API surface clean to read, but it leaves evaluators without a working starting point for a real LLM call. They have to invent the async-openai wiring themselves — token extraction from `response.usage`, `caps.max_tokens` application, etc.

Background analysis on the docs side flagged this as the most likely cause of the 13:1 clone-to-install ratio on this repo (907 GitHub clones, 67 crates.io installs as of 2026-05-16, vs ~1.2:1 for the TypeScript client). The hypothesis is that lots of evaluators clone, look at `examples/`, see only stub-based code, and bounce before `cargo add runcycles`.

This PR closes that gap on the repo side. (Docs side: cycles-docs PR #659, which lands today, adds `/configuration/rust-client-configuration-reference.md` + `/how-to/integrating-cycles-with-async-openai.md` covering streaming, error-aware ReservationGuard patterns, and token-to-microcents conversion.)

What's in this PR

New example:

`examples/async_openai_completion.rs` — 60-line `with_cycles` example. Reserves `Amount::tokens(1_500)`, narrows `max_tokens` from `caps.max_tokens` when ALLOW_WITH_CAPS, fires `openai.chat().create()`, extracts `response.usage.total_tokens`, commits the actual.
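
The flow described for `examples/async_openai_completion.rs` (reserve an estimate, narrow the completion limit to any cap, call the model, commit the actual usage) can be sketched with stand-in types. Everything below is hypothetical: `Decision`, `Usage`, `Settlement`, `run_governed`, and `fake_model` are illustration only and are not the runcycles or async-openai API. Taking the minimum of the estimate and the cap is likewise an assumption about what "narrows" means here.

```rust
// Hypothetical stand-ins for illustration; NOT runcycles or async-openai types.

/// What the budget authority answered for a reservation request.
enum Decision {
    Allow,
    AllowWithCaps { max_tokens: u32 },
}

/// Token usage reported by the model provider.
struct Usage {
    total_tokens: u32,
}

/// Final state of the reservation after the governed call.
#[derive(Debug, PartialEq)]
enum Settlement {
    Committed { tokens: i64 },
    Released { reason: String },
}

/// Runs `call_model` under a reservation. The limit handed to the model is the
/// estimate, narrowed to the cap when one is present (assumed semantics). The
/// reported total is committed on success; any error releases the reservation.
fn run_governed<F>(estimate: u32, decision: Decision, call_model: F) -> Settlement
where
    F: FnOnce(u32) -> Result<Usage, String>,
{
    let limit = match decision {
        Decision::Allow => estimate,
        Decision::AllowWithCaps { max_tokens } => estimate.min(max_tokens),
    };
    match call_model(limit) {
        Ok(usage) => Settlement::Committed {
            tokens: i64::from(usage.total_tokens),
        },
        Err(reason) => Settlement::Released { reason },
    }
}

/// Fake model that consumes exactly the limit it was given.
fn fake_model(limit: u32) -> Result<Usage, String> {
    Ok(Usage { total_tokens: limit })
}

fn main() {
    let capped = run_governed(1_500, Decision::AllowWithCaps { max_tokens: 800 }, fake_model);
    let uncapped = run_governed(1_500, Decision::Allow, fake_model);
    println!("{capped:?}");
    println!("{uncapped:?}");
}
```

The point of the shape is that the commit amount comes from what the provider reported, not from the estimate that was reserved.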

Cargo.toml:

Adds `async-openai = "0.30"` to `[dev-dependencies]`. This is dev-only: it does not affect downstream consumers' dependency graph, but it is pulled into CI builds.

README.md:

Adds a callout in the Quick Start section pointing at the new example. Keeps the existing minimal `call_llm("Hello")` snippet (good for skimming the API surface) but signposts the real-LLM example directly below.

Verified locally

```
cargo check --example async_openai_completion --all-features ✓
cargo build --example async_openai_completion --all-features ✓
cargo clippy --example async_openai_completion --all-features -- -D warnings ✓
cargo fmt --check ✓
```

CI will re-run these on the standard `runcycles/.github/.github/workflows/ci-rust.yml@v1` matrix (stable + 1.88 with `--all-features`).

Out of scope

Two follow-up examples that would close the parallel gaps but each deserves its own PR + iteration:

`examples/axum_middleware.rs` — Axum web-framework integration. Axum middleware uses `tower::Service` and the version surface there is intricate enough that I want to iterate it separately.
`examples/async_openai_streaming.rs` — Streaming chat completions. The streaming flow uses `ReservationGuard` rather than `with_cycles` (token totals arrive only after the stream ends), plus the `stream_options.include_usage = true` gotcha. Worth a separate file so the with-cycles example stays minimal.
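
To illustrate why a guard suits streaming, here is a minimal sketch in which the reservation stays open while chunks are consumed and is settled only once the total is known. It assumes a guard that releases on drop unless it was committed; that is an assumption about the pattern, not a statement about how runcycles' `ReservationGuard` behaves. `Ledger`, `Guard`, `Event`, and `consume_stream` are hypothetical names, and the fake stream reports usage as its final event.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-ins for illustration; NOT the runcycles ReservationGuard API.

/// Records what happened to reservations so the outcome can be inspected.
#[derive(Debug, Default)]
struct Ledger {
    committed_tokens: i64,
    released_reservations: u32,
}

/// Holds a reservation open; releases it on drop unless `commit` was called.
struct Guard {
    ledger: Rc<RefCell<Ledger>>,
    settled: bool,
}

impl Guard {
    fn reserve(ledger: Rc<RefCell<Ledger>>) -> Self {
        Guard { ledger, settled: false }
    }

    /// Consumes the guard and records the actual usage.
    fn commit(mut self, actual_tokens: i64) {
        self.ledger.borrow_mut().committed_tokens += actual_tokens;
        self.settled = true;
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        if !self.settled {
            self.ledger.borrow_mut().released_reservations += 1;
        }
    }
}

/// One item of a fake stream: text chunks, then a usage total at the end.
enum Event {
    Chunk(&'static str),
    Usage(u32),
}

fn consume_stream(
    ledger: Rc<RefCell<Ledger>>,
    events: Vec<Result<Event, String>>,
) -> Result<String, String> {
    let guard = Guard::reserve(ledger);
    let mut text = String::new();
    let mut total = None;
    for event in events {
        // An early return here drops `guard`, which releases the reservation.
        match event? {
            Event::Chunk(chunk) => text.push_str(chunk),
            Event::Usage(tokens) => total = Some(tokens),
        }
    }
    let total = total.ok_or_else(|| "stream ended without usage".to_string())?;
    guard.commit(i64::from(total));
    Ok(text)
}

fn main() {
    let ledger = Rc::new(RefCell::new(Ledger::default()));
    let complete = vec![Ok(Event::Chunk("Hel")), Ok(Event::Chunk("lo")), Ok(Event::Usage(42))];
    println!("{:?}", consume_stream(Rc::clone(&ledger), complete));
    let broken = vec![Ok(Event::Chunk("Hel")), Err("connection reset".to_string())];
    println!("{:?}", consume_stream(Rc::clone(&ledger), broken));
    println!("{:?}", ledger.borrow());
}
```

A closure-style wrapper has to know the actual amount when the closure returns; a guard lets the caller forward chunks first and settle afterwards.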

Test plan

CI green
`cargo run --example async_openai_completion` works end-to-end with a real `OPENAI_API_KEY` and a running Cycles server
README rendering shows the new callout in a sensible position

The existing /examples set (basic_usage, error_handling, guard_usage, streaming_usage, with_cycles_usage) all use an inline `call_llm` stub. That demonstrates the runcycles API surface cleanly, but it leaves evaluators without a working starting point for a real LLM call: they have to invent the async-openai wiring themselves, including token extraction from response.usage and cap-to-max_tokens application. This was flagged as the most likely cause of the 13:1 clone-to-install ratio on cycles-client-rust: lots of evaluators clone, look at the examples, and bounce because nothing shows the real-LLM composition.

This PR fills that gap:

- Adds `examples/async_openai_completion.rs`, a 60-line `with_cycles` example that wires async-openai 0.30.x against runcycles. Reserves `Amount::tokens(1_500)`, applies `caps.max_tokens` from ALLOW_WITH_CAPS, fires the chat completion, extracts `response.usage.total_tokens`, commits the actual.
- Adds `async-openai = "0.30"` to dev-dependencies (dev-only: does not affect downstream consumers, but pulls into CI builds).
- Updates the README's Quick Start section with a callout pointing at the new example for users who want a real LLM call rather than the `call_llm` placeholder.

Verified locally:

- cargo check --example async_openai_completion --all-features ✓
- cargo build --example async_openai_completion --all-features ✓
- cargo clippy --example async_openai_completion --all-features -- -D warnings ✓
- cargo fmt --check ✓

Companion docs PR in cycles-docs (PR #659) covers the same composition in detail (streaming via ReservationGuard, error-aware patterns, token-to-microcents conversion, gotchas).

Out of scope (not in this PR):

- `examples/axum_middleware.rs`: would close the parallel gap for the Rust web-framework audience. Axum middleware needs the tower::Service trait surface, which is enough complexity that it warrants its own PR with its own iteration cycle.
- A streaming `async_openai_streaming.rs` example: the streaming pattern uses ReservationGuard (not with_cycles, since token totals arrive only after the stream ends). Worth a separate example for the same reason.

Codex flagged design/idiom issues that cargo check / clippy / fmt can't see, which is exactly the value of layered review on top of compile checks. All compile-pass, all human-readable now.

Apply/skip tally: 8 applied, 0 pushed back.

Applied:

- **Silent under-billing fixed.** The original `response.usage.map(...).unwrap_or(0)` would commit `Amount::tokens(0)` when a provider omitted `usage`: silent under-billing on a successful-looking response, which is exactly wrong for a teaching example about budget governance. Now `response.usage.ok_or(...)?` so the closure errors and `with_cycles` auto-releases the reservation. Production code that needs a fallback must opt into one explicitly.
- **Empty/missing content now errors.** `response.choices.first().and_then(|c| c.message.content.clone()).unwrap_or_default()` silently returned an empty string. Now uses `.ok_or(...)?` so a response with no choices or no content fails loud and the reservation releases.
- **Zero/negative cap now errors.** `cap.max(0)` would have sent `max_tokens=0` to OpenAI (which OpenAI rejects, after we've already spent the request budget). Now `u32::try_from(cap)` errors on negatives, and a zero cap errors explicitly. Both happen before any network call to OpenAI.
- **`.max_tokens()` → `.max_completion_tokens()`.** OpenAI deprecated `max_tokens` for chat completions in favor of `max_completion_tokens`. async-openai 0.30.1 supports both; the example uses the current name.
- **`u.total_tokens as i64` → `i64::from(u.total_tokens)`.** More idiomatic, no `as` cast.
- **Module doc comment accuracy:**
  - Removed the inaccurate "CYCLES_BASE_URL env var override" claim; the code hardcodes the URL.
  - Added a "Loud-failure stance" section explicitly explaining the new error-on-edge-cases design.
  - Softened the absolute "streaming uses ReservationGuard instead of with_cycles" framing; the guard is the right primitive when chunks are forwarded while the reservation remains open.
  - Removed editorial "the example most users actually want" tone.
- **README callout under-stated requirements.** Now explicitly mentions: reachable Cycles server, tenant API key, and a TOKENS-denominated budget at the scope being reserved against, not just `OPENAI_API_KEY`.
- **README claim softened.** Dropped "so the budget reflects actual spend" framing (which was directionally true only when usage is present) to "threading the response's usage.total_tokens back into the commit", which is accurate regardless of failure path.

Verified locally:

- cargo check --example async_openai_completion --all-features ✓
- cargo clippy --example async_openai_completion --all-features -- -D warnings ✓
- cargo fmt --check ✓
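
The loud-failure fixes listed above can be sketched in isolation as follows. The response structs are a hypothetical, simplified shape and not the async-openai types; `completion_limit` and `extract` are illustrative helper names that do not appear in the PR. The sketch uses `ok_or_else` where the commit message mentions `ok_or`, only to build the error string lazily.

```rust
// Hypothetical, simplified response shape; NOT the async-openai types.
struct Usage {
    total_tokens: u32,
}

struct Message {
    content: Option<String>,
}

struct Choice {
    message: Message,
}

struct Response {
    choices: Vec<Choice>,
    usage: Option<Usage>,
}

/// Turns a signed cap into a usable completion limit, rejecting values that
/// do not fit in `u32` (including negatives) and rejecting zero explicitly.
fn completion_limit(cap: i64) -> Result<u32, String> {
    let limit = u32::try_from(cap).map_err(|_| format!("cap {cap} is out of range"))?;
    if limit == 0 {
        return Err("cap is zero; refusing to call the provider".to_string());
    }
    Ok(limit)
}

/// Extracts the text and the token total, failing instead of defaulting when
/// either is missing, so a caller never commits zero tokens by accident.
fn extract(response: &Response) -> Result<(String, i64), String> {
    let content = response
        .choices
        .first()
        .and_then(|choice| choice.message.content.clone())
        .ok_or_else(|| "response had no choices or no content".to_string())?;
    let usage = response
        .usage
        .as_ref()
        .ok_or_else(|| "provider omitted usage".to_string())?;
    // Lossless widening from u32; no `as` cast needed.
    Ok((content, i64::from(usage.total_tokens)))
}

fn main() {
    println!("{:?}", completion_limit(256));
    println!("{:?}", completion_limit(0));
    println!("{:?}", completion_limit(-1));

    let response = Response {
        choices: vec![Choice {
            message: Message { content: Some("hi".to_string()) },
        }],
        usage: None,
    };
    // Missing usage is an error rather than a silent zero-token commit.
    println!("{:?}", extract(&response));
}
```

Both helpers return `Err` before or instead of producing a value, so a wrapper that releases the reservation on error never records an undercounted commit.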

CI's `cargo audit --deny warnings` step failed on this PR because the 0.30.x line of async-openai pulled in two unmaintained transitive deps that the org-wide audit job treats as errors:

- backoff 0.4.0 (RUSTSEC-2025-0012, unmaintained)
- instant 0.1.13 (RUSTSEC-2024-0384, unmaintained, via backoff)

async-openai 0.31+ replaced its retry stack with `tower`, dropping the backoff dependency entirely. Bumping to 0.38 (current latest) clears both advisories without needing audit-ignore configuration.

API changes between 0.30 and 0.38 are minor for the example's usage:

- `Client` is now gated behind the per-API features (the crate split its surface in 0.31). Enabled `chat-completion` for the example.
- Chat-completion types moved from `async_openai::types::` to `async_openai::types::chat::`. Updated the import.
- `default-features = false, features = ["chat-completion", "rustls"]` keeps the dev-dep set minimal: no unused-feature surface, and rustls matches the rest of the runcycles crate's TLS story.

The example still uses the same `with_cycles` flow, the same `response.usage.total_tokens` extraction, and the same loud-failure patterns from the codex-round-1 fixes. Only the imports and Cargo.toml changed.

Verified locally:

- cargo check --example async_openai_completion --all-features ✓
- cargo clippy --example async_openai_completion --all-features -- -D warnings ✓
- cargo fmt --check ✓
- cargo tree -i backoff → "did not match any packages" ✓
- cargo tree -i instant → "did not match any packages" ✓

Note: cycles-docs PR #659 currently pins async-openai 0.30.x in its how-to doc; that doc needs a parallel bump to 0.38 + types::chat:: in a separate commit on that PR.
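
Put together, the version bump and feature flags named in this commit message would correspond to a dev-dependency entry along these lines. This is a reconstruction from the message, not a copy of the repository's Cargo.toml, so the exact layout there may differ.

```toml
[dev-dependencies]
# Dev-only: used by examples/async_openai_completion.rs, not by downstream consumers.
async-openai = { version = "0.38", default-features = false, features = ["chat-completion", "rustls"] }
```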

amavashev added 3 commits May 16, 2026 11:32

amavashev merged commit ca15f8e into main May 16, 2026
8 checks passed

amavashev deleted the examples/async-openai-completion branch May 16, 2026 16:03

amavashev mentioned this pull request May 16, 2026

docs(rust): sync async-openai version pin to 0.38 runcycles/cycles-docs#660

Merged

3 tasks
