Add HealthCheckRuntime context manager for shared boilerplate [1/2] #77
Open
gustcol wants to merge 2 commits into facebookresearch:main
…plate

Extract the ~30 lines of repeated setup code (logger init, GPU node ID detection, derived cluster resolution, TelemetryContext + OutputContext nesting, killswitch check) into a reusable HealthCheckRuntime dataclass context manager. This reduces per-subcommand boilerplate from ~30 lines to ~5 lines.

The helper is purely additive: existing checks continue to work unchanged. New checks can use `with HealthCheckRuntime(...) as rt:` instead of manually wiring up the setup ceremony.

Includes comprehensive tests covering field initialization, killswitch behavior, context manager nesting, GPU node ID failure handling, and the finish() convenience method.

Refs: facebookresearch#75
This was referenced Mar 1, 2026
Apply ufmt formatting and fix mypy errors in test helper by using explicit typed parameters instead of **kwargs dict unpacking.
Summary
Ref: #75
- Adds `HealthCheckRuntime`, a `@dataclass` context manager that encapsulates the ~30 lines of repeated setup code every health check subcommand duplicates: logger initialization, GPU node ID detection, derived cluster resolution, `ExitStack` with `TelemetryContext` + `OutputContext`, and killswitch checking
- Purely additive: existing checks continue to work unchanged, and new checks opt in through a `with HealthCheckRuntime(...) as rt:` block

Stacked PR series: [1/2] Runtime helper → [2/2] Scaffold tool (depends on this PR)
Before (~30 lines per subcommand)
After (~5 lines per subcommand)
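The pattern described in the summary, a dataclass that is also a context manager and nests its sub-contexts through an `ExitStack`, can be sketched roughly as below. This is an illustration, not the PR's code: `RuntimeSketch`, `telemetry_context`, `output_context`, the `killswitch_enabled` and `detect_node_id` callables, and the `events` list are all hypothetical stand-ins. The choice to log a warning and continue when node ID detection fails, and to evaluate the killswitch after the contexts are entered, are assumptions. Derived cluster resolution and `finish()` are not modeled.

```python
import logging
from contextlib import ExitStack, contextmanager
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@contextmanager
def telemetry_context(events: List[str]) -> Iterator[None]:
    """Hypothetical stand-in for TelemetryContext; records enter/exit."""
    events.append("telemetry:enter")
    try:
        yield
    finally:
        events.append("telemetry:exit")


@contextmanager
def output_context(events: List[str]) -> Iterator[None]:
    """Hypothetical stand-in for OutputContext; records enter/exit."""
    events.append("output:enter")
    try:
        yield
    finally:
        events.append("output:exit")


def _killswitch_off() -> bool:
    return False


def _node_zero() -> int:
    return 0


@dataclass
class RuntimeSketch:
    """Illustrative only; field names are assumptions, not the real API."""

    check_name: str
    logger: logging.Logger = field(init=False, repr=False)
    killswitch_enabled: Callable[[], bool] = _killswitch_off
    detect_node_id: Callable[[], int] = _node_zero
    events: List[str] = field(default_factory=list)
    node_id: Optional[int] = field(init=False, default=None)
    disabled: bool = field(init=False, default=False)
    _stack: ExitStack = field(init=False, repr=False, default_factory=ExitStack)

    def __post_init__(self) -> None:
        self.logger = logging.getLogger(self.check_name)

    def __enter__(self) -> "RuntimeSketch":
        # Assumption: a detection failure is logged and leaves node_id as None.
        try:
            self.node_id = self.detect_node_id()
        except Exception as exc:
            self.logger.warning("node ID detection failed: %s", exc)
            self.node_id = None
        # ExitStack unwinds in reverse order: output exits before telemetry.
        self._stack.enter_context(telemetry_context(self.events))
        self._stack.enter_context(output_context(self.events))
        self.disabled = self.killswitch_enabled()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        return bool(self._stack.__exit__(exc_type, exc, tb))


def run_check() -> int:
    """What a subcommand body might shrink to under this sketch."""
    with RuntimeSketch("check_example") as rt:
        if rt.disabled:
            return 0
        rt.logger.info("running on node %s", rt.node_id)
        return 0
```

Delegating cleanup to a single `ExitStack` means the subcommand never has to write the nested `with` statements itself, which is where most of the per-subcommand savings would come from under these assumptions.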
Test plan
- `nox -s tests -- gcm/tests/health_checks_tests/test_runtime.py`: 6 tests covering initialization, killswitch behavior, finish(), context nesting, GPU node ID failure
- `nox -s lint`
- `nox -s format`
- `nox -s typecheck`