Add new connectrpc-health crate#128
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
Implemented via a new crate that is not yet merged into upstream: anthropics/connect-rust#128
Implemented via a new crate that is not yet merged into upstream: anthropics/connect-rust#128
iainmcgin
left a comment
There was a problem hiding this comment.
[claude code] Thanks — this is a strong contribution and we'd like to take it. We ran it through both a correctness/concurrency review and an API-ergonomics review; the hand-written code came out of both in very good shape: locks are block-scoped and never held across awaits, send_if_modified suppresses no-op wake-ups, dropping a Watch stream releases the subscriber (with a regression test for exactly that), mutex poisoning is recovered rather than propagated, peer-supplied service names are only ever used as lookup keys (no insertion path a caller can drive), and every documented behaviour of StaticChecker has a colocated test. The deliberate deviation on Watch for unknown services (NOT_FOUND rather than the spec's open stream with SERVICE_UNKNOWN, matching connect-go's grpchealth) is documented consistently in all four places it matters — that's fine by us. The buf/Taskfile wiring matches the other generated directories and the generated output follows the fully-qualified-paths convention.
The asks below are about the public API surface, since this becomes a published crate.
Before merge
- Feature-gate the client half.
connectrpc-health/Cargo.toml:16unconditionally enablesconnectrpc'sserverandclientfeatures, so a server that only exposes the health service still pulls in the whole client transport stack. Please add aclientfeature on this crate (off by default or default — your call) that forwards toconnectrpc/client, and gate theHealthClient/HealthExtre-exports behind it. - Make the common case one line and leave the user holding a usable handle. Today the static, always-SERVING case is two
Arc::news, anArc::clone,from_arc, andregister— and the discoverable constructor (HealthService::new) seals the checker inside the service so there's nothing left to callset_statuson. A crate-level convenience along the lines ofinstall_static(router, [SERVICE_NAME]) -> (Router, Arc<StaticChecker>)(keepingfrom_arcfor custom checkers) would close most of the gap to grpchealth/tonic-health. Related:checker(&self) -> &Arc<C>is an unusual return type — either returnArc<C>(clone internally) or have the doc spell outArc::clone(svc.checker()), since "Clone it" currently steers users intoclippy::clone_on_ref_ptrterritory. - Decide and document the
set_statusregister-on-miss semantics.StaticChecker::set_statussilently creates an entry for any name (static_checker.rs:61-76), so a typo'd service name "works" while probes for the real FQN keep reporting the old status, and the map grows if the name ever comes from non-static input. connect-go'sStaticChecker.SetStatuserrors on unknown names. Either split registration from updates (or add a stricttry_set_status), or keep the current behaviour but say loudly onset_statusitself that it registers on miss and must only be fed the generated*_SERVICE_NAMEconstants; aremove_servicewould round the API out either way. - Guide fixes. The example at
docs/guide.md:1001doesn't compile:Statusis used but not imported, and the Cargo snippet above it omits theconnectrpc = { version = "0.6", features = ["server"] }dependency the example needs. Marking the blockrust,no_run(like the lib.rs Quick Start) would let CI catch future drift. - Crate metadata. Please add a
readme(the sibling crates point at the workspace README) and bump the crate version to match the workspace now that 0.6.1 is out.
Fine as follow-ups (or in this PR if you prefer)
From<ServingStatus> for Status(orTryFrom) for probe-loop code decoding raw responses; the forward direction already exists.- The default
Checker::watchreturningUnimplementedis spec-legal, but it's easy for a custom-Checkeruser to ship without noticing Watch is dead — consider a doc callout right on the trait method (or atracingwarning) rather than only under# Errors. - A
with_services_status(...)form so a service can be registered as NotServing from the start. - The
#[allow(clippy::upper_case_acronyms)]on the generated-module wrappers appears unused. - An end-to-end test that a client disconnect mid-Watch drops the server-side subscriber (the WatchStream-drop unit test is convincing; the hyper-level version would tie it together).
On process: CI for first-time contributors needs a maintainer to approve the workflow run — we'll do that once the above lands, and the new "Check generated code" job that just merged should pass given the Taskfile/buf wiring you've already included. Happy to re-review quickly after the changes.
When set via `opt: [gate_client_feature]` (or `Config::gate_client_feature(true)` in build.rs), the codegen prefixes every emitted `FooClient<T>` struct and its `impl` block with `#[cfg(feature = "client")]`. Consumer crates declare a `client` Cargo feature so a server-only build trims its dependency graph. Default off, so external consumers with their own proto files aren't forced to declare any Cargo feature.
Vendor the standard `grpc.health.v1.Health` proto. Wire up `buf generate` behind a new `connectrpc-health:generate` task. Add the crate to the workspace, and ship the `Status` Rust-side enum plus re-exports.
`Checker` is the user-facing trait: an async `check(service) -> Status` plus an optional `watch(service) -> StatusStream` that defaults to `Unimplemented`. `StatusStream` boxes a `Stream<Item = Status>` and exposes a `from_watch(receiver)` shortcut for the common `tokio::sync::watch` backed implementation.
`StaticChecker` is the batteries-included `Checker` for the common case: a `Mutex<HashMap<String, watch::Sender<Status>>>` registers services once and lets the user flip their status from outside the service via `set_status`/ `shutdown`. `watch()` reuses the existing `Sender` for both registered and `""` services so concurrent watchers share one upstream.
`HealthService<C>` wraps an `Arc<C: Checker>` and implements the generated `Health` trait, so registration follows the workspace convention. Round-trip tests spin up a Health server on a free TCP port and exercise `Check` (serving / not-serving / unknown / post-mutation) and `Watch` (initial + change notification, Unimplemented fallback) through the wire-format generated client. Also: fills out the crate-level docstring with a quick-start, adds the `Health checking` section to `docs/guide.md`, and lists the new crate in the workspace README.
dfde93d to
238b699
Compare
|
Thanks for your review on this @iainmcgin. I've added a new commit at the start that modifies the codegen to achieves the feature-gate you requested. The other commits have been amended to incorporate your feedback. While working on the codegen changes I realized it would make sense to expose codegen knobs for disabling client/server generation all together, if you're only interested in one of them. Skipped this here, as that seemed like a bigger change. But curious what you think about that. |
|
@iainmcgin friendly ping |
Inspired by connectrpc.com/grpchealth and the equivalent Tonic implementation. Breaking up into four different commits felt kind of awkward, but did it per CONTRIBUTING.md guidelines.