Add new connectrpc-health crate #128

Open
torkelrogstad wants to merge 5 commits into anthropics:main from torkelrogstad:2026-05-21-health

Conversation

@torkelrogstad

Inspired by connectrpc.com/grpchealth and the equivalent Tonic implementation. Breaking this up into four separate commits felt kind of awkward, but I did it per the CONTRIBUTING.md guidelines.

@github-actions

github-actions Bot commented May 21, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@torkelrogstad
Author

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request May 21, 2026
torkelrogstad added a commit to LayerTwo-Labs/bip300301_enforcer that referenced this pull request May 21, 2026
Implemented via a new crate that is not yet merged into upstream: anthropics/connect-rust#128
Collaborator

@iainmcgin iainmcgin left a comment


[claude code] Thanks — this is a strong contribution and we'd like to take it. We ran it through both a correctness/concurrency review and an API-ergonomics review; the hand-written code came out of both in very good shape: locks are block-scoped and never held across awaits, send_if_modified suppresses no-op wake-ups, dropping a Watch stream releases the subscriber (with a regression test for exactly that), mutex poisoning is recovered rather than propagated, peer-supplied service names are only ever used as lookup keys (no insertion path a caller can drive), and every documented behaviour of StaticChecker has a colocated test. The deliberate deviation on Watch for unknown services (NOT_FOUND rather than the spec's open stream with SERVICE_UNKNOWN, matching connect-go's grpchealth) is documented consistently in all four places it matters — that's fine by us. The buf/Taskfile wiring matches the other generated directories and the generated output follows the fully-qualified-paths convention.
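Two of the properties praised above (block-scoped locks and recovery from mutex poisoning) can be illustrated with the standard library alone. This is a minimal sketch, not the PR's code: `lock_recovering`, `lookup`, and `poison` are hypothetical helper names, and a `bool` stands in for the crate's status type.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, MutexGuard, PoisonError};
use std::thread;

/// Lock `m`, taking the guard back out of the error if an earlier
/// holder panicked (poisoning) instead of propagating the failure.
pub fn lock_recovering<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(PoisonError::into_inner)
}

/// Look up a value with the lock confined to an inner block, so the guard
/// is dropped before any later work (in async code: before any `.await`).
pub fn lookup(map: &Mutex<HashMap<String, bool>>, service: &str) -> Option<bool> {
    let found = {
        let guard = lock_recovering(map);
        guard.get(service).copied()
    }; // guard is released here
    found
}

/// Poison the mutex on purpose by panicking in another thread while
/// holding the lock. Only used to demonstrate the recovery path.
pub fn poison(map: &Arc<Mutex<HashMap<String, bool>>>) {
    let m = Arc::clone(map);
    let _ = thread::spawn(move || {
        let _guard = m.lock().unwrap();
        panic!("simulated panic while holding the lock");
    })
    .join();
}
```

After `poison(&map)`, `map.is_poisoned()` reports true, yet `lookup` still returns the stored value because the guard is recovered rather than unwrapped.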

The asks below are about the public API surface, since this becomes a published crate.

Before merge

  1. Feature-gate the client half. connectrpc-health/Cargo.toml:16 unconditionally enables connectrpc's server and client features, so a server that only exposes the health service still pulls in the whole client transport stack. Please add a client feature on this crate (off by default or default — your call) that forwards to connectrpc/client, and gate the HealthClient/HealthExt re-exports behind it.
  2. Make the common case one line and leave the user holding a usable handle. Today the static, always-SERVING case is two Arc::news, an Arc::clone, from_arc, and register — and the discoverable constructor (HealthService::new) seals the checker inside the service so there's nothing left to call set_status on. A crate-level convenience along the lines of install_static(router, [SERVICE_NAME]) -> (Router, Arc<StaticChecker>) (keeping from_arc for custom checkers) would close most of the gap to grpchealth/tonic-health. Related: checker(&self) -> &Arc<C> is an unusual return type — either return Arc<C> (clone internally) or have the doc spell out Arc::clone(svc.checker()), since "Clone it" currently steers users into clippy::clone_on_ref_ptr territory.
  3. Decide and document the set_status register-on-miss semantics. StaticChecker::set_status silently creates an entry for any name (static_checker.rs:61-76), so a typo'd service name "works" while probes for the real FQN keep reporting the old status, and the map grows if the name ever comes from non-static input. connect-go's StaticChecker.SetStatus errors on unknown names. Either split registration from updates (or add a strict try_set_status), or keep the current behaviour but say loudly on set_status itself that it registers on miss and must only be fed the generated *_SERVICE_NAME constants; a remove_service would round the API out either way.
  4. Guide fixes. The example at docs/guide.md:1001 doesn't compile: Status is used but not imported, and the Cargo snippet above it omits the connectrpc = { version = "0.6", features = ["server"] } dependency the example needs. Marking the block rust,no_run (like the lib.rs Quick Start) would let CI catch future drift.
  5. Crate metadata. Please add a readme (the sibling crates point at the workspace README) and bump the crate version to match the workspace now that 0.6.1 is out.
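The difference raised in item 3 between register-on-miss and strict updates can be sketched as follows. This is an illustration only: `Registry`, `UnknownService`, and the method bodies are assumptions, not the crate's actual `StaticChecker`, and a plain `HashMap` replaces the watch-channel storage.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's status enum (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Serving,
    NotServing,
}

/// Hypothetical error for a strict update of an unregistered name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UnknownService(pub String);

#[derive(Default)]
pub struct Registry {
    services: HashMap<String, Status>,
}

impl Registry {
    /// Explicit registration, kept separate from updates.
    pub fn register(&mut self, name: &str, status: Status) {
        self.services.insert(name.to_owned(), status);
    }

    /// Lenient update: silently creates an entry for any name.
    pub fn set_status(&mut self, name: &str, status: Status) {
        self.services.insert(name.to_owned(), status);
    }

    /// Strict update: fails for names that were never registered.
    pub fn try_set_status(&mut self, name: &str, status: Status) -> Result<(), UnknownService> {
        match self.services.get_mut(name) {
            Some(slot) => {
                *slot = status;
                Ok(())
            }
            None => Err(UnknownService(name.to_owned())),
        }
    }

    /// Rounds out the API: drop a service entirely.
    pub fn remove_service(&mut self, name: &str) -> Option<Status> {
        self.services.remove(name)
    }

    pub fn status(&self, name: &str) -> Option<Status> {
        self.services.get(name).copied()
    }
}
```

With a typo such as `"acme.v1.Fooo"`, `try_set_status` returns an error, whereas `set_status` adds a second entry and leaves `"acme.v1.Foo"` at its old status.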

Fine as follow-ups (or in this PR if you prefer)

  • From<ServingStatus> for Status (or TryFrom) for probe-loop code decoding raw responses; the forward direction already exists.
  • The default Checker::watch returning Unimplemented is spec-legal, but it's easy for a custom-Checker user to ship without noticing Watch is dead — consider a doc callout right on the trait method (or a tracing warning) rather than only under # Errors.
  • A with_services_status(...) form so a service can be registered as NotServing from the start.
  • The #[allow(clippy::upper_case_acronyms)] on the generated-module wrappers appears unused.
  • An end-to-end test that a client disconnect mid-Watch drops the server-side subscriber (the WatchStream-drop unit test is convincing; the hyper-level version would tie it together).
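The first follow-up (a reverse conversion for probe-loop code) might look roughly like this. The numeric values follow `grpc.health.v1.HealthCheckResponse.ServingStatus`; the three-variant `Status` enum and the `NoSuchService` error are assumptions made for the sketch, since the crate's actual variants are not shown in this thread.

```rust
use std::convert::TryFrom;

/// Wire-level enum, numbered as in the `grpc.health.v1` proto.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServingStatus {
    Unknown = 0,
    Serving = 1,
    NotServing = 2,
    ServiceUnknown = 3,
}

/// Assumed shape of the Rust-side enum (hypothetical variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Unknown,
    Serving,
    NotServing,
}

/// Hypothetical error: the wire value has no Rust-side counterpart.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NoSuchService;

/// Forward direction: always succeeds.
impl From<Status> for ServingStatus {
    fn from(s: Status) -> Self {
        match s {
            Status::Unknown => ServingStatus::Unknown,
            Status::Serving => ServingStatus::Serving,
            Status::NotServing => ServingStatus::NotServing,
        }
    }
}

/// Reverse direction: fallible, because SERVICE_UNKNOWN has no
/// equivalent in the assumed `Status` enum.
impl TryFrom<ServingStatus> for Status {
    type Error = NoSuchService;

    fn try_from(raw: ServingStatus) -> Result<Self, Self::Error> {
        match raw {
            ServingStatus::Unknown => Ok(Status::Unknown),
            ServingStatus::Serving => Ok(Status::Serving),
            ServingStatus::NotServing => Ok(Status::NotServing),
            ServingStatus::ServiceUnknown => Err(NoSuchService),
        }
    }
}
```

If `Status` did carry a variant for unknown services, a plain `From` would suffice; `TryFrom` is the conservative choice under the assumption above.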

On process: CI for first-time contributors needs a maintainer to approve the workflow run — we'll do that once the above lands, and the new "Check generated code" job that just merged should pass given the Taskfile/buf wiring you've already included. Happy to re-review quickly after the changes.

When set via `opt: [gate_client_feature]` (or
`Config::gate_client_feature(true)` in build.rs), the codegen prefixes
every emitted `FooClient<T>` struct and its `impl` block with
`#[cfg(feature = "client")]`. Consumer crates declare a `client` Cargo
feature so a server-only build trims its dependency
graph.

Default off, so external consumers with their own proto files aren't
forced to declare any Cargo feature.
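A rough picture of what the gated output could look like is sketched below. `FooServer`, `FooClient`, and `HAS_CLIENT` are illustrative names, not actual generator output. The consumer crate would declare a `client` feature in its `Cargo.toml` (per the review, forwarding to `connectrpc/client`); without it, the client items are simply not compiled.

```rust
/// Server-side type: always compiled, regardless of features.
pub struct FooServer;

impl FooServer {
    pub fn service_name(&self) -> &'static str {
        "example.v1.Foo"
    }
}

/// Client-side type: compiled only when the consumer crate enables
/// its `client` Cargo feature.
#[cfg(feature = "client")]
pub struct FooClient<T> {
    transport: T,
}

/// The `impl` block carries the same gate as the struct.
#[cfg(feature = "client")]
impl<T> FooClient<T> {
    pub fn new(transport: T) -> Self {
        Self { transport }
    }

    pub fn transport(&self) -> &T {
        &self.transport
    }
}

/// True only when this crate was built with the `client` feature.
pub const HAS_CLIENT: bool = cfg!(feature = "client");
```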

Vendor the standard `grpc.health.v1.Health` proto. Wire up `buf generate`
behind a new `connectrpc-health:generate` task. Add the crate to the workspace,
and ship the `Status` Rust-side enum plus re-exports.

`Checker` is the user-facing trait: an async `check(service) -> Status`
plus an optional `watch(service) -> StatusStream` that defaults to
`Unimplemented`. `StatusStream` boxes a `Stream<Item = Status>` and
exposes a `from_watch(receiver)` shortcut for the common
`tokio::sync::watch` backed implementation.
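The trait shape described in this commit message can be approximated with a synchronous, std-only sketch. The real trait is async and `StatusStream` boxes a `Stream`; here a boxed `Iterator` stands in for it, and `Unimplemented`, `StatusIter`, `AlwaysServing`, and `FixedSequence` are hypothetical names chosen for the illustration.

```rust
/// Simplified stand-in for the crate's `Status` enum (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Serving,
    NotServing,
}

/// Stand-in for the RPC error a real implementation would return.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Unimplemented;

/// Synchronous stand-in for `StatusStream`.
pub type StatusIter = Box<dyn Iterator<Item = Status> + Send>;

pub trait Checker: Send + Sync {
    /// Required: report the current status of `service`.
    fn check(&self, service: &str) -> Status;

    /// Optional: stream status changes. The default rejects the call,
    /// so an implementor that forgets to override it has no working Watch.
    fn watch(&self, _service: &str) -> Result<StatusIter, Unimplemented> {
        Err(Unimplemented)
    }
}

/// Implements only `check`; `watch` falls back to the default.
pub struct AlwaysServing;

impl Checker for AlwaysServing {
    fn check(&self, _service: &str) -> Status {
        Status::Serving
    }
}

/// Overrides `watch` by replaying a fixed list of statuses.
pub struct FixedSequence(pub Vec<Status>);

impl Checker for FixedSequence {
    fn check(&self, _service: &str) -> Status {
        self.0.last().copied().unwrap_or(Status::NotServing)
    }

    fn watch(&self, _service: &str) -> Result<StatusIter, Unimplemented> {
        Ok(Box::new(self.0.clone().into_iter()))
    }
}
```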

`StaticChecker` is the batteries-included `Checker` for the common case:
a `Mutex<HashMap<String, watch::Sender<Status>>>` registers services
once and lets the user flip their status from outside the service via
`set_status` / `shutdown`.

`watch()` reuses the existing `Sender` for both registered and `""`
services so concurrent watchers share one upstream.
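The "flip status from outside the service" idea can be sketched without tokio. In this simplified version a plain `Status` value replaces each `watch::Sender<Status>`, `StaticCheckerSketch` is a hypothetical name, and the assumption that `shutdown` marks every entry as not serving is mine, not confirmed by the PR. The `bool` returned by `set_status` loosely mirrors how `send_if_modified` suppresses no-op notifications.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, MutexGuard, PoisonError};

/// Simplified stand-in for the crate's `Status` enum (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Serving,
    NotServing,
}

pub struct StaticCheckerSketch {
    services: Mutex<HashMap<String, Status>>,
}

impl StaticCheckerSketch {
    /// Register the given names plus "" (the whole server), all SERVING.
    pub fn new(names: &[&str]) -> Arc<Self> {
        let mut map = HashMap::new();
        map.insert(String::new(), Status::Serving);
        for name in names {
            map.insert((*name).to_owned(), Status::Serving);
        }
        Arc::new(Self { services: Mutex::new(map) })
    }

    fn lock(&self) -> MutexGuard<'_, HashMap<String, Status>> {
        self.services.lock().unwrap_or_else(PoisonError::into_inner)
    }

    /// `None` means the service was never registered.
    pub fn check(&self, service: &str) -> Option<Status> {
        let map = self.lock();
        map.get(service).copied()
    }

    /// Returns true only if the stored value actually changed.
    pub fn set_status(&self, service: &str, status: Status) -> bool {
        let previous = self.lock().insert(service.to_owned(), status);
        previous != Some(status)
    }

    /// Assumed semantics: mark every registered service as not serving.
    pub fn shutdown(&self) {
        for status in self.lock().values_mut() {
            *status = Status::NotServing;
        }
    }
}
```

The caller keeps one `Arc` clone as its handle and gives another to the service, so status changes made through the handle are visible to every probe.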

`HealthService<C>` wraps an `Arc<C: Checker>` and implements the
generated `Health` trait, so registration follows the workspace
convention.

Round-trip tests spin up a Health server on a free TCP port and
exercise `Check` (serving / not-serving / unknown / post-mutation) and
`Watch` (initial + change notification, Unimplemented fallback) through
the wire-format generated client.
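The "free TCP port" technique these tests rely on is commonly done by binding to port 0 and reading back the address the OS assigned. The sketch below shows that pattern with a trivial echo exchange in place of the Health RPCs; `bind_free_port` and `round_trip` are hypothetical helper names, and the PR's actual test harness may differ.

```rust
use std::io::{Read, Write};
use std::net::{Shutdown, SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Bind to port 0 so the OS picks a free port, then report the address.
pub fn bind_free_port() -> std::io::Result<(TcpListener, SocketAddr)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}

/// Start a one-shot echo server on a free port, send `payload` to it,
/// and return whatever comes back.
pub fn round_trip(payload: &[u8]) -> std::io::Result<Vec<u8>> {
    let (listener, addr) = bind_free_port()?;

    // Server side: accept one connection and echo everything it sent.
    let server = thread::spawn(move || -> std::io::Result<()> {
        let (mut conn, _peer) = listener.accept()?;
        let mut received = Vec::new();
        conn.read_to_end(&mut received)?;
        conn.write_all(&received)?;
        Ok(())
    });

    // Client side: send, half-close so the server sees EOF, then read.
    let mut client = TcpStream::connect(addr)?;
    client.write_all(payload)?;
    client.shutdown(Shutdown::Write)?;
    let mut echoed = Vec::new();
    client.read_to_end(&mut echoed)?;

    server.join().expect("server thread panicked")?;
    Ok(echoed)
}
```

Because the listener is bound before the client connects, there is no race between choosing the port and starting to listen on it.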

Also: fills out the crate-level docstring with a quick-start, adds the
`Health checking` section to `docs/guide.md`, and lists the new crate
in the workspace README.
@torkelrogstad
Author

Thanks for your review on this @iainmcgin. I've added a new commit at the start that modifies the codegen to achieve the feature gate you requested. The other commits have been amended to incorporate your feedback.

While working on the codegen changes I realized it would make sense to expose codegen knobs for disabling client or server generation altogether, if you're only interested in one of them. I skipped this here, as it seemed like a bigger change, but I'm curious what you think about it.

@torkelrogstad
Author

@iainmcgin friendly ping
