feat: implement Phase 3 kdm analyze command by utkarsh232005 · Pull Request #127 · KDM-cli/kdm-cli

utkarsh232005 · 2026-06-07T07:12:15Z

This PR implements Phase 3 of the project: kdm analyze command without AI, including Pod, Deployment, Service, PVC, and Node analyzers along with unit tests.

Summary by CodeRabbit

New Features

Added analyze command for Kubernetes cluster diagnostics
Detects pod crashes and restart issues, deployment replica failures, service misconfigurations, unschedulable nodes, and storage binding problems
Supports filtering by namespace and label selector
JSON and text output formats available
Configurable kubeconfig and context for multi-cluster support

coderabbitai · 2026-06-07T07:12:26Z

Warning

Review limit reached

@utkarsh232005, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 9 minutes and 46 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 4a5f4527-2b3d-4666-a6e4-77bd5ac7b6bb

📥 Commits

Reviewing files that changed from the base of the PR and between 313c985 and 95f637a.

⛔ Files ignored due to path filters (2)

.DS_Store is excluded by !**/.DS_Store and included by none
coding_style.md is excluded by none and included by none

📒 Files selected for processing (13)

src/__tests__/analyze-command.test.ts
src/__tests__/kubernetes-analyzers.test.ts
src/analysis/analysis.ts
src/analysis/types.ts
src/analyzers/deployment.ts
src/analyzers/node.ts
src/analyzers/pod.ts
src/analyzers/pvc.ts
src/analyzers/service.ts
src/analyzers/types.ts
src/commands/analyze.ts
src/kubernetes/client.ts
src/kubernetes/resources.ts

Note

`.coderabbit.yaml` has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)

Validation error: Unrecognized key: "pre_merge_checks"

⚙️ Configuration instructions

Please see the configuration documentation for more information.
You can also validate your configuration using the online YAML validator.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

📝 Walkthrough

Walkthrough

This PR implements a complete Kubernetes analyzer system: five analyzer implementations detect pod crashes, deployment replica failures, service validation issues, PVC binding problems, and node pressure/scheduling anomalies. It adds configurable Kubernetes client support with kubeconfig/context overrides, resource listing helpers, an analyze CLI command, and comprehensive test coverage.

Changes

Kubernetes Analysis Feature

Layer / File(s)	Summary
Kubernetes client and resource infrastructure `src/kubernetes/client.ts`, `src/kubernetes/resources.ts`, `src/analyzers/types.ts`	Kubernetes client now supports configurable kubeconfig and context with per-key API client caching. Resource module wraps client APIs with namespace and label selector filtering for Pods, Deployments, Services, PVCs, and Nodes. Endpoints reading and label-to-selector conversion helpers added. `AnalyzerContext` extended with kubeconfig/kubecontext fields.
Analysis engine and option threading `src/analysis/types.ts`, `src/analysis/analysis.ts`	`AnalysisOptions` gains optional `kubeconfig` and `kubecontext` fields. Engine threads these into the context object passed to each analyzer.
Analyzer implementations and registry `src/analyzers/pod.ts`, `src/analyzers/deployment.ts`, `src/analyzers/service.ts`, `src/analyzers/pvc.ts`, `src/analyzers/node.ts`, `src/analyzers/index.ts`	Five analyzers detect Pod phase failures/restarts, Deployment unavailable replicas/progress deadlines, Service selector/endpoint/port validation, PVC pending/lost phases and binding conditions, and Node readiness/pressure/scheduling status. Registry imports and re-exports implementations.
Analyze CLI command and integration `src/commands/analyze.ts`, `src/commands/root.ts`	New `analyze`/`analyse` command accepts namespace, label selector, analyzer filters, text/json output, max concurrency, and kubeconfig/context overrides; manages spinner, formats results, and sets exit code on error. Integrated into root command.
Test coverage `src/__tests__/analysis.test.ts`, `src/__tests__/analyze-command.test.ts`, `src/__tests__/kubernetes-analyzers.test.ts`	Mocks for Kubernetes resources, full analyzer test scenarios (pod crashes, deployment failures, service validation, PVC binding, node pressure), and CLI option parsing and error handling validation.

Sequence Diagrams

sequenceDiagram
  participant User
  participant CLI as Analyze Command
  participant Engine as runAnalysis
  participant Analyzers
  participant K8s as Kubernetes API
  User->>CLI: analyze --namespace prod --filter pod,deployment
  CLI->>CLI: Validate output format, parse options
  CLI->>CLI: Start spinner (if text output)
  CLI->>Engine: runAnalysis(options with filters, namespace, kubeconfig)
  Engine->>Analyzers: forEach analyzer: analyze(context with kubeconfig/context)
  Analyzers->>K8s: listPods, listDeployments, etc (namespace, selector)
  K8s->>Analyzers: Resource lists
  Analyzers->>Analyzers: Detect failures per resource
  Analyzers->>Engine: AnalyzerResult[] with errors
  Engine->>CLI: Aggregated results
  CLI->>CLI: Format as JSON or text
  CLI->>User: Print output, stop spinner, exit 0 or 1

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The PR introduces substantial new analyzer logic across five separate implementations with distinct validation rules (pod phases/restarts, deployment replicas/conditions, service selector/endpoint/port resolution, PVC phases/conditions, node readiness/pressure). The resource infrastructure and client refactoring require careful review of caching logic and kubeconfig handling. Comprehensive test coverage aids review but the heterogeneity of analyzer failure detection logic (each checks different Kubernetes resource properties and states) and the multiple files involved (13 modified/added in src/ alone) demand careful reasoning per analyzer. Service analyzer's port resolution logic in particular is dense and worth careful attention to edge cases (string vs numeric targetPort handling).

Possibly related PRs

KDM-cli/kdm-cli#96: Extends Phase 2 analysis engine with kubeconfig/context threading into analyzer context, mirroring the same src/analysis/analysis.ts and type infrastructure.

Suggested labels

version2

Poem

🔍 Five analyzers wake to scan the cluster wide,
🚀 Pods and Nodes and Services, scrutinized with pride,
⚙️ Kubeconfig threads through the engine's beating heart,
📊 A CLI command completes the working part,
✨ Detect the crashes, replicas, and pain!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: implementing the Phase 3 kdm analyze command, which is the primary objective of this PR containing analyzers and command registration.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-06-07T07:13:07Z

Codecov Report

❌ Patch coverage is 66.42857% with 94 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/analyzers/service.ts	75.82%	5 Missing and 17 partials ⚠️
src/kubernetes/client.ts	44.82%	13 Missing and 3 partials ⚠️
src/analyzers/deployment.ts	51.72%	2 Missing and 12 partials ⚠️
src/analyzers/pod.ts	62.85%	5 Missing and 8 partials ⚠️
src/analyzers/node.ts	66.66%	2 Missing and 9 partials ⚠️
src/analyzers/pvc.ts	56.00%	2 Missing and 9 partials ⚠️
src/commands/analyze.ts	83.78%	4 Missing and 2 partials ⚠️
src/analysis/analysis.ts	0.00%	0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/analysis/analysis.ts (1)
50-104: 🧹 Nitpick | 🔵 Trivial | 🏗️ Heavy lift

Consider adding cancellation support for long-running analyses.

The analysis engine correctly isolates analyzer failures and respects concurrency limits, but it lacks cancellation support. For CLI tools analyzing large clusters, users may need to cancel long-running operations (Ctrl+C). Consider threading an AbortSignal through AnalysisOptions and AnalyzerContext to enable graceful cancellation.

This would improve UX when analyzing large namespaces or clusters with many resources.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/analysis/analysis.ts` around lines 50 - 104, Add cancellation support by
threading an AbortSignal from AnalysisOptions into the AnalyzerContext and
honoring it in runAnalysis: extend AnalysisOptions to accept signal
(AbortSignal), add that signal to the context object built in runAnalysis, and
ensure each analyzer receives the signal via analyzer.analyze(context). In
runAnalysis, check signal.aborted before starting work in the worker loop and
after awaiting measureDuration to break/return early and push a cancellation
error if desired; also pass the signal into any helper functions (e.g.,
measureDuration) or short-circuit calls when aborted. Update any types
referencing AnalyzerContext/AnalysisOptions and callers that construct
AnalyzerContext so analyzers can bail out promptly when the signal is triggered.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/__tests__/analyze-command.test.ts`:
- Around line 36-81: The test suite lacks coverage for error conditions around
runAnalysis and output validation—add tests that (1) mock runAnalysis to reject
(e.g., throw "K8s connection failed") and assert errorSpy is called and
process.exitCode is set to 1, (2) mock runAnalysis to resolve with a
ProblemDetected status or a non-empty errors array and assert error logging/exit
behavior, (3) verify text output format validation by calling program.parseAsync
with text output variations beyond JSON to assert proper error handling, and (4)
add edge-case tests for empty filters/missing namespace to confirm expected
behavior; use the existing runAnalysis mock, program.parseAsync invocations, and
spies (errorSpy, logSpy) in analyze-command.test.ts to implement these cases.
- Line 80: The test currently sets process.exitCode = undefined inline which may
not run on failures; add an afterEach hook in the test suite (same file/test
block where analyze-command tests run) that resets process.exitCode = undefined
to guarantee cleanup, i.e., add afterEach(() => { process.exitCode = undefined
}) near the existing beforeEach/afterAll hooks and remove the inline reset from
the test body (references: process.exitCode, afterEach).

In `@src/__tests__/kubernetes-analyzers.test.ts`:
- Around line 29-203: Add tests that simulate Kubernetes API failures by mocking
the relevant list/read helpers to reject (e.g.,
vi.mocked(listPods).mockRejectedValueOnce(new Error('API timeout'))), then
assert the analyzer behavior for PodAnalyzer.analyze,
DeploymentAnalyzer.analyze, ServiceAnalyzer.analyze,
PersistentVolumeClaimAnalyzer.analyze and NodeAnalyzer.analyze—either expecting
the promise to reject with the error or to resolve to an empty array depending
on the intended graceful-degradation behavior; ensure each test uses
vi.clearAllMocks() already present in beforeEach and references the exact helper
names (listPods, listDeployments, listServices, readEndpoints,
listPersistentVolumeClaims, listNodes) and analyzer symbols
(PodAnalyzer.analyze, DeploymentAnalyzer.analyze, ServiceAnalyzer.analyze,
PersistentVolumeClaimAnalyzer.analyze, NodeAnalyzer.analyze) so the test
additions are easy to locate and validate.

In `@src/analyzers/deployment.ts`:
- Around line 8-34: The replica availability checks in getDeploymentFailures
currently can emit two overlapping messages (the desired>available "Deployment
has {available}/{desired} available replicas" and the separate
unavailableReplicas message); change the logic so these are consolidated: when
desired > available emit only a single combined message like "Deployment has
{available}/{desired} available replica(s)" (handle pluralization) and skip
emitting the unavailableReplicas message in that branch; only emit the
unavailableReplicas message when desired <= available but
status.unavailableReplicas > 0. Keep all other condition checks
(Progressing/ProgressDeadlineExceeded and condition.status === 'False')
unchanged.

In `@src/analyzers/node.ts`:
- Around line 9-32: getNodeFailures has been flagged for complexity; split
responsibilities into small helpers to reduce cyclomatic complexity: extract
Ready check, pressure-condition check, and unschedulable check into separate
functions (e.g., isNodeReadyFailure(node: k8s.V1Node): Failure | null,
getPressureFailures(node: k8s.V1Node): Failure[], and
getUnschedulableFailure(node: k8s.V1Node): Failure | null), keep
PRESSURE_CONDITIONS usage inside the pressure helper, and have getNodeFailures
simply aggregate results from those helpers and return the combined Failure[].

In `@src/analyzers/pod.ts`:
- Around line 16-51: Static analysis flagged getPodFailures as a complex method;
refactor by extracting its logical sections into smaller helper functions to
preserve behavior and reduce complexity: create checkPodPhase(pod: k8s.V1Pod):
Failure[] to handle phase === 'Failed' logic, checkPodScheduling(pod:
k8s.V1Pod): Failure[] to handle Pending/PodScheduled checks, and
checkContainerStates(pod: k8s.V1Pod): Failure[] to iterate
pod.status?.containerStatuses and apply the WAITING_FAILURE_REASONS, readiness,
and RESTART_WARNING_THRESHOLD checks; then have getPodFailures return the
concatenation of those helpers' results so existing behavior and constants
remain unchanged.

In `@src/analyzers/pvc.ts`:
- Around line 8-32: getPvcFailures is flagged for cyclomatic complexity;
refactor it by extracting the distinct checks into small helper functions to
simplify the main flow: create helpers like checkPendingPhase(pvc:
V1PersistentVolumeClaim): Failure[], checkLostPhase(pvc:
V1PersistentVolumeClaim): Failure[], and collectStatusConditions(pvc:
V1PersistentVolumeClaim): Failure[] and have getPvcFailures call these helpers
and concatenate their results; keep the existing failure messages and the use of
pvc.status?.conditions and pvc.spec?.storageClassName so behavior remains
identical.
- Around line 12-17: When phase === 'Pending' avoid emitting both a generic
"PersistentVolumeClaim is Pending" and a more specific "pending without a
storage class"; instead, in the Pending handling (where phase and failures are
used and pvc.spec?.storageClassName is checked) push only the specific failure
if pvc.spec?.storageClassName is missing, otherwise push a single generic
Pending message — update the logic around the failures.push calls so the
storageClassName check controls whether to report the specific message and skip
the generic one.

In `@src/analyzers/service.ts`:
- Around line 11-75: getServiceFailures currently calls listPods per service
(via labelsToSelector -> listPods), causing N API calls; instead prefetch and
reuse pod lists per-namespace or batch services so you filter in-memory. Modify
the analyzer to fetch pods once per namespace (e.g., add a cached
listPodsCached(namespace, context) helper on AnalyzerContext or fetch all pods
before iterating services and pass matchingPods into getServiceFailures), then
replace the direct listPods call in getServiceFailures with a lookup from the
cache; alternatively, if services cannot share listings, run getServiceFailures
calls in parallel with Promise.all to avoid serial waits. Ensure you keep
selector logic (labelsToSelector) and existing behavior for ExternalName and
readEndpoints unchanged.
- Around line 83-93: The current sequential for-loop over services blocks
execution; change it to run analyses in parallel by mapping services to promises
that call getServiceFailures(service, context) and using Promise.allSettled(...)
on that array. After allSettled resolves, iterate the settled results: for each
fulfilled result, if the returned errors array has entries push the same object
shape (kind:'Service', name:serviceName(service),
namespace:serviceNamespace(service), errors) into results; for each rejected
result push or record a failure entry for that service (use
serviceName/serviceNamespace and include the rejection reason.message in errors)
so a single service error won’t abort the whole analysis. Ensure you keep
references to service in the map callback so you can build the result objects
after settling.

In `@src/commands/analyze.ts`:
- Around line 29-59: Add cancellation via an AbortController: create an
AbortController before calling runAnalysis, register process.on('SIGINT') to
call abortController.abort(), stop/fail the spinner (spinner?.fail('Analysis
cancelled')) and set process.exitCode = 130, and pass abortController.signal
into runAnalysis (update the runAnalysis call site and its signature/handlers to
accept a signal and wire it through to any Kubernetes API/client calls). Ensure
runAnalysis and any k8s client usage (e.g., watch/list/stream requests) respect
the provided AbortSignal so in-flight requests cancel cleanly.
- Around line 29-59: Replace direct console usage in the .action handler with
the project's logger: import logger from '../utils/logger', then use logger.info
or logger.debug instead of console.log for successful output (use logger.info
for formatTextOutput and logger.debug or logger.info for formatJsonOutput as
appropriate) and use logger.error instead of console.error when reporting errors
inside the catch block; also replace spinner?.stop/ fail messages that call
console with logger calls where appropriate or keep spinner messages but ensure
final output/error is emitted via logger (refer to the .action async callback,
variables output, spinner, runAnalysis, formatJsonOutput, formatTextOutput).

In `@src/kubernetes/client.ts`:
- Around line 21-31: Validate the kubeconfig path and surface loading errors
instead of silently ignoring them: before calling
kc.loadFromFile(options.kubeconfig) in the code that uses options.kubeconfig,
ensure options.kubeconfig is a safe, existing file within allowed boundaries
(e.g., resolve and check it is not using path traversal and that fs.statSync
reports a file), then call kc.loadFromFile; on any error from kc.loadFromFile,
kc.loadFromDefault, or kc.setCurrentContext, log the caught exception with
context (including the provided path and kubecontext) and rethrow or return an
error so callers can react; reference kc, options.kubeconfig, kc.loadFromFile,
kc.loadFromDefault, and kc.setCurrentContext when updating the try/catch
behavior.

In `@src/kubernetes/resources.ts`:
- Around line 70-81: The readEndpoints function currently swallows all errors
(using getK8sApi and unwrap) and returns undefined, which hides non-404
failures; update readEndpoints to catch errors from api.readNamespacedEndpoints,
inspect the error (e.g. HTTP status/code from the thrown k8s error) and only
return undefined for 404 NotFound, while re-throwing other errors so callers can
handle network/auth failures; alternatively return a result object { success:
boolean, endpoints?: V1Endpoints, error?: string } if you prefer explicit error
propagation—apply the change in the readEndpoints implementation and keep use of
getK8sApi and unwrap callers intact.

---

Outside diff comments:
In `@src/analysis/analysis.ts`:
- Around line 50-104: Add cancellation support by threading an AbortSignal from
AnalysisOptions into the AnalyzerContext and honoring it in runAnalysis: extend
AnalysisOptions to accept signal (AbortSignal), add that signal to the context
object built in runAnalysis, and ensure each analyzer receives the signal via
analyzer.analyze(context). In runAnalysis, check signal.aborted before starting
work in the worker loop and after awaiting measureDuration to break/return early
and push a cancellation error if desired; also pass the signal into any helper
functions (e.g., measureDuration) or short-circuit calls when aborted. Update
any types referencing AnalyzerContext/AnalysisOptions and callers that construct
AnalyzerContext so analyzers can bail out promptly when the signal is triggered.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: abdf90f8-ecc9-48c1-8e14-6b71392be09e

📥 Commits

Reviewing files that changed from the base of the PR and between 986a6e7 and 313c985.

⛔ Files ignored due to path filters (2)

.gitignore is excluded by none and included by none
docs/what-is-kdm.md is excluded by none and included by none

📒 Files selected for processing (16)

src/__tests__/analysis.test.ts
src/__tests__/analyze-command.test.ts
src/__tests__/kubernetes-analyzers.test.ts
src/analysis/analysis.ts
src/analysis/types.ts
src/analyzers/deployment.ts
src/analyzers/index.ts
src/analyzers/node.ts
src/analyzers/pod.ts
src/analyzers/pvc.ts
src/analyzers/service.ts
src/analyzers/types.ts
src/commands/analyze.ts
src/commands/root.ts
src/kubernetes/client.ts
src/kubernetes/resources.ts

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

…e analyzer, and add JSDocs

…validation

…and for CodeScene delta gates

codescene-delta-analysis

Gates Passed
3 Quality Gates Passed

See analysis details in CodeScene

Quality Gate Profile: The Bare Minimum
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.

feat: implement Phase 3 kdm analyze command and core analyzers

313c985

github-actions Bot added documentation Improvements or additions to documentation frontend labels Jun 7, 2026