feat: implement Phase 3 kdm analyze command#127
Conversation
|
Warning Review limit reached
More reviews will be available in 9 minutes and 46 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Plus Run ID: ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (13)
Note
|
| Layer / File(s) | Summary |
|---|---|
Kubernetes client and resource infrastructure src/kubernetes/client.ts, src/kubernetes/resources.ts, src/analyzers/types.ts |
Kubernetes client now supports configurable kubeconfig and context with per-key API client caching. Resource module wraps client APIs with namespace and label selector filtering for Pods, Deployments, Services, PVCs, and Nodes. Endpoints reading and label-to-selector conversion helpers added. AnalyzerContext extended with kubeconfig/kubecontext fields. |
Analysis engine and option threading src/analysis/types.ts, src/analysis/analysis.ts |
AnalysisOptions gains optional kubeconfig and kubecontext fields. Engine threads these into the context object passed to each analyzer. |
Analyzer implementations and registry src/analyzers/pod.ts, src/analyzers/deployment.ts, src/analyzers/service.ts, src/analyzers/pvc.ts, src/analyzers/node.ts, src/analyzers/index.ts |
Five analyzers detect Pod phase failures/restarts, Deployment unavailable replicas/progress deadlines, Service selector/endpoint/port validation, PVC pending/lost phases and binding conditions, and Node readiness/pressure/scheduling status. Registry imports and re-exports implementations. |
Analyze CLI command and integration src/commands/analyze.ts, src/commands/root.ts |
New analyze/analyse command accepts namespace, label selector, analyzer filters, text/json output, max concurrency, and kubeconfig/context overrides; manages spinner, formats results, and sets exit code on error. Integrated into root command. |
Test coverage src/__tests__/analysis.test.ts, src/__tests__/analyze-command.test.ts, src/__tests__/kubernetes-analyzers.test.ts |
Mocks for Kubernetes resources, full analyzer test scenarios (pod crashes, deployment failures, service validation, PVC binding, node pressure), and CLI option parsing and error handling validation. |
Sequence Diagrams
sequenceDiagram
participant User
participant CLI as Analyze Command
participant Engine as runAnalysis
participant Analyzers
participant K8s as Kubernetes API
User->>CLI: analyze --namespace prod --filter pod,deployment
CLI->>CLI: Validate output format, parse options
CLI->>CLI: Start spinner (if text output)
CLI->>Engine: runAnalysis(options with filters, namespace, kubeconfig)
Engine->>Analyzers: forEach analyzer: analyze(context with kubeconfig/context)
Analyzers->>K8s: listPods, listDeployments, etc (namespace, selector)
K8s->>Analyzers: Resource lists
Analyzers->>Analyzers: Detect failures per resource
Analyzers->>Engine: AnalyzerResult[] with errors
Engine->>CLI: Aggregated results
CLI->>CLI: Format as JSON or text
CLI->>User: Print output, stop spinner, exit 0 or 1
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
The PR introduces substantial new analyzer logic across five separate implementations with distinct validation rules (pod phases/restarts, deployment replicas/conditions, service selector/endpoint/port resolution, PVC phases/conditions, node readiness/pressure). The resource infrastructure and client refactoring require careful review of caching logic and kubeconfig handling. Comprehensive test coverage aids review but the heterogeneity of analyzer failure detection logic (each checks different Kubernetes resource properties and states) and the multiple files involved (13 modified/added in src/ alone) demand careful reasoning per analyzer. Service analyzer's port resolution logic in particular is dense and worth careful attention to edge cases (string vs numeric targetPort handling).
Possibly related PRs
- KDM-cli/kdm-cli#96: Extends Phase 2 analysis engine with kubeconfig/context threading into analyzer context, mirroring the same
src/analysis/analysis.tsand type infrastructure.
Suggested labels
version2
Poem
🔍 Five analyzers wake to scan the cluster wide,
🚀 Pods and Nodes and Services, scrutinized with pride,
⚙️ Kubeconfig threads through the engine's beating heart,
📊 A CLI command completes the working part,
✨ Detect the crashes, replicas, and pain!
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing the Phase 3 kdm analyze command, which is the primary objective of this PR containing analyzers and command registration. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
- Create PR with unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/analysis/analysis.ts (1)
50-104: 🧹 Nitpick | 🔵 Trivial | 🏗️ Heavy liftConsider adding cancellation support for long-running analyses.
The analysis engine correctly isolates analyzer failures and respects concurrency limits, but it lacks cancellation support. For CLI tools analyzing large clusters, users may need to cancel long-running operations (Ctrl+C). Consider threading an
AbortSignalthroughAnalysisOptionsandAnalyzerContextto enable graceful cancellation.This would improve UX when analyzing large namespaces or clusters with many resources.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/analysis/analysis.ts` around lines 50 - 104, Add cancellation support by threading an AbortSignal from AnalysisOptions into the AnalyzerContext and honoring it in runAnalysis: extend AnalysisOptions to accept signal (AbortSignal), add that signal to the context object built in runAnalysis, and ensure each analyzer receives the signal via analyzer.analyze(context). In runAnalysis, check signal.aborted before starting work in the worker loop and after awaiting measureDuration to break/return early and push a cancellation error if desired; also pass the signal into any helper functions (e.g., measureDuration) or short-circuit calls when aborted. Update any types referencing AnalyzerContext/AnalysisOptions and callers that construct AnalyzerContext so analyzers can bail out promptly when the signal is triggered.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/__tests__/analyze-command.test.ts`:
- Around line 36-81: The test suite lacks coverage for error conditions around
runAnalysis and output validation—add tests that (1) mock runAnalysis to reject
(e.g., throw "K8s connection failed") and assert errorSpy is called and
process.exitCode is set to 1, (2) mock runAnalysis to resolve with a
ProblemDetected status or a non-empty errors array and assert error logging/exit
behavior, (3) verify text output format validation by calling program.parseAsync
with text output variations beyond JSON to assert proper error handling, and (4)
add edge-case tests for empty filters/missing namespace to confirm expected
behavior; use the existing runAnalysis mock, program.parseAsync invocations, and
spies (errorSpy, logSpy) in analyze-command.test.ts to implement these cases.
- Line 80: The test currently sets process.exitCode = undefined inline which may
not run on failures; add an afterEach hook in the test suite (same file/test
block where analyze-command tests run) that resets process.exitCode = undefined
to guarantee cleanup, i.e., add afterEach(() => { process.exitCode = undefined
}) near the existing beforeEach/afterAll hooks and remove the inline reset from
the test body (references: process.exitCode, afterEach).
In `@src/__tests__/kubernetes-analyzers.test.ts`:
- Around line 29-203: Add tests that simulate Kubernetes API failures by mocking
the relevant list/read helpers to reject (e.g.,
vi.mocked(listPods).mockRejectedValueOnce(new Error('API timeout'))), then
assert the analyzer behavior for PodAnalyzer.analyze,
DeploymentAnalyzer.analyze, ServiceAnalyzer.analyze,
PersistentVolumeClaimAnalyzer.analyze and NodeAnalyzer.analyze—either expecting
the promise to reject with the error or to resolve to an empty array depending
on the intended graceful-degradation behavior; ensure each test uses
vi.clearAllMocks() already present in beforeEach and references the exact helper
names (listPods, listDeployments, listServices, readEndpoints,
listPersistentVolumeClaims, listNodes) and analyzer symbols
(PodAnalyzer.analyze, DeploymentAnalyzer.analyze, ServiceAnalyzer.analyze,
PersistentVolumeClaimAnalyzer.analyze, NodeAnalyzer.analyze) so the test
additions are easy to locate and validate.
In `@src/analyzers/deployment.ts`:
- Around line 8-34: The replica availability checks in getDeploymentFailures
currently can emit two overlapping messages (the desired>available "Deployment
has {available}/{desired} available replicas" and the separate
unavailableReplicas message); change the logic so these are consolidated: when
desired > available emit only a single combined message like "Deployment has
{available}/{desired} available replica(s)" (handle pluralization) and skip
emitting the unavailableReplicas message in that branch; only emit the
unavailableReplicas message when desired <= available but
status.unavailableReplicas > 0. Keep all other condition checks
(Progressing/ProgressDeadlineExceeded and condition.status === 'False')
unchanged.
In `@src/analyzers/node.ts`:
- Around line 9-32: getNodeFailures has been flagged for complexity; split
responsibilities into small helpers to reduce cyclomatic complexity: extract
Ready check, pressure-condition check, and unschedulable check into separate
functions (e.g., isNodeReadyFailure(node: k8s.V1Node): Failure | null,
getPressureFailures(node: k8s.V1Node): Failure[], and
getUnschedulableFailure(node: k8s.V1Node): Failure | null), keep
PRESSURE_CONDITIONS usage inside the pressure helper, and have getNodeFailures
simply aggregate results from those helpers and return the combined Failure[].
In `@src/analyzers/pod.ts`:
- Around line 16-51: Static analysis flagged getPodFailures as a complex method;
refactor by extracting its logical sections into smaller helper functions to
preserve behavior and reduce complexity: create checkPodPhase(pod: k8s.V1Pod):
Failure[] to handle phase === 'Failed' logic, checkPodScheduling(pod:
k8s.V1Pod): Failure[] to handle Pending/PodScheduled checks, and
checkContainerStates(pod: k8s.V1Pod): Failure[] to iterate
pod.status?.containerStatuses and apply the WAITING_FAILURE_REASONS, readiness,
and RESTART_WARNING_THRESHOLD checks; then have getPodFailures return the
concatenation of those helpers' results so existing behavior and constants
remain unchanged.
In `@src/analyzers/pvc.ts`:
- Around line 8-32: getPvcFailures is flagged for cyclomatic complexity;
refactor it by extracting the distinct checks into small helper functions to
simplify the main flow: create helpers like checkPendingPhase(pvc:
V1PersistentVolumeClaim): Failure[], checkLostPhase(pvc:
V1PersistentVolumeClaim): Failure[], and collectStatusConditions(pvc:
V1PersistentVolumeClaim): Failure[] and have getPvcFailures call these helpers
and concatenate their results; keep the existing failure messages and the use of
pvc.status?.conditions and pvc.spec?.storageClassName so behavior remains
identical.
- Around line 12-17: When phase === 'Pending' avoid emitting both a generic
"PersistentVolumeClaim is Pending" and a more specific "pending without a
storage class"; instead, in the Pending handling (where phase and failures are
used and pvc.spec?.storageClassName is checked) push only the specific failure
if pvc.spec?.storageClassName is missing, otherwise push a single generic
Pending message — update the logic around the failures.push calls so the
storageClassName check controls whether to report the specific message and skip
the generic one.
In `@src/analyzers/service.ts`:
- Around line 11-75: getServiceFailures currently calls listPods per service
(via labelsToSelector -> listPods), causing N API calls; instead prefetch and
reuse pod lists per-namespace or batch services so you filter in-memory. Modify
the analyzer to fetch pods once per namespace (e.g., add a cached
listPodsCached(namespace, context) helper on AnalyzerContext or fetch all pods
before iterating services and pass matchingPods into getServiceFailures), then
replace the direct listPods call in getServiceFailures with a lookup from the
cache; alternatively, if services cannot share listings, run getServiceFailures
calls in parallel with Promise.all to avoid serial waits. Ensure you keep
selector logic (labelsToSelector) and existing behavior for ExternalName and
readEndpoints unchanged.
- Around line 83-93: The current sequential for-loop over services blocks
execution; change it to run analyses in parallel by mapping services to promises
that call getServiceFailures(service, context) and using Promise.allSettled(...)
on that array. After allSettled resolves, iterate the settled results: for each
fulfilled result, if the returned errors array has entries push the same object
shape (kind:'Service', name:serviceName(service),
namespace:serviceNamespace(service), errors) into results; for each rejected
result push or record a failure entry for that service (use
serviceName/serviceNamespace and include the rejection reason.message in errors)
so a single service error won’t abort the whole analysis. Ensure you keep
references to service in the map callback so you can build the result objects
after settling.
In `@src/commands/analyze.ts`:
- Around line 29-59: Add cancellation via an AbortController: create an
AbortController before calling runAnalysis, register process.on('SIGINT') to
call abortController.abort(), stop/fail the spinner (spinner?.fail('Analysis
cancelled')) and set process.exitCode = 130, and pass abortController.signal
into runAnalysis (update the runAnalysis call site and its signature/handlers to
accept a signal and wire it through to any Kubernetes API/client calls). Ensure
runAnalysis and any k8s client usage (e.g., watch/list/stream requests) respect
the provided AbortSignal so in-flight requests cancel cleanly.
- Around line 29-59: Replace direct console usage in the .action handler with
the project's logger: import logger from '../utils/logger', then use logger.info
or logger.debug instead of console.log for successful output (use logger.info
for formatTextOutput and logger.debug or logger.info for formatJsonOutput as
appropriate) and use logger.error instead of console.error when reporting errors
inside the catch block; also replace spinner?.stop/ fail messages that call
console with logger calls where appropriate or keep spinner messages but ensure
final output/error is emitted via logger (refer to the .action async callback,
variables output, spinner, runAnalysis, formatJsonOutput, formatTextOutput).
In `@src/kubernetes/client.ts`:
- Around line 21-31: Validate the kubeconfig path and surface loading errors
instead of silently ignoring them: before calling
kc.loadFromFile(options.kubeconfig) in the code that uses options.kubeconfig,
ensure options.kubeconfig is a safe, existing file within allowed boundaries
(e.g., resolve and check it is not using path traversal and that fs.statSync
reports a file), then call kc.loadFromFile; on any error from kc.loadFromFile,
kc.loadFromDefault, or kc.setCurrentContext, log the caught exception with
context (including the provided path and kubecontext) and rethrow or return an
error so callers can react; reference kc, options.kubeconfig, kc.loadFromFile,
kc.loadFromDefault, and kc.setCurrentContext when updating the try/catch
behavior.
In `@src/kubernetes/resources.ts`:
- Around line 70-81: The readEndpoints function currently swallows all errors
(using getK8sApi and unwrap) and returns undefined, which hides non-404
failures; update readEndpoints to catch errors from api.readNamespacedEndpoints,
inspect the error (e.g. HTTP status/code from the thrown k8s error) and only
return undefined for 404 NotFound, while re-throwing other errors so callers can
handle network/auth failures; alternatively return a result object { success:
boolean, endpoints?: V1Endpoints, error?: string } if you prefer explicit error
propagation—apply the change in the readEndpoints implementation and keep use of
getK8sApi and unwrap callers intact.
---
Outside diff comments:
In `@src/analysis/analysis.ts`:
- Around line 50-104: Add cancellation support by threading an AbortSignal from
AnalysisOptions into the AnalyzerContext and honoring it in runAnalysis: extend
AnalysisOptions to accept signal (AbortSignal), add that signal to the context
object built in runAnalysis, and ensure each analyzer receives the signal via
analyzer.analyze(context). In runAnalysis, check signal.aborted before starting
work in the worker loop and after awaiting measureDuration to break/return early
and push a cancellation error if desired; also pass the signal into any helper
functions (e.g., measureDuration) or short-circuit calls when aborted. Update
any types referencing AnalyzerContext/AnalysisOptions and callers that construct
AnalyzerContext so analyzers can bail out promptly when the signal is triggered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: abdf90f8-ecc9-48c1-8e14-6b71392be09e
⛔ Files ignored due to path filters (2)
.gitignoreis excluded by none and included by nonedocs/what-is-kdm.mdis excluded by none and included by none
📒 Files selected for processing (16)
src/__tests__/analysis.test.tssrc/__tests__/analyze-command.test.tssrc/__tests__/kubernetes-analyzers.test.tssrc/analysis/analysis.tssrc/analysis/types.tssrc/analyzers/deployment.tssrc/analyzers/index.tssrc/analyzers/node.tssrc/analyzers/pod.tssrc/analyzers/pvc.tssrc/analyzers/service.tssrc/analyzers/types.tssrc/commands/analyze.tssrc/commands/root.tssrc/kubernetes/client.tssrc/kubernetes/resources.ts
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…e analyzer, and add JSDocs
…and for CodeScene delta gates
There was a problem hiding this comment.
Gates Passed
3 Quality Gates Passed
See analysis details in CodeScene
Quality Gate Profile: The Bare Minimum
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.
This PR implements Phase 3 of the project: kdm analyze command without AI, including Pod, Deployment, Service, PVC, and Node analyzers along with unit tests.
Summary by CodeRabbit
New Features
analyzecommand for Kubernetes cluster diagnostics