Skip to content

feat: implement Phase 3 kdm analyze command#127

Merged
utkarsh232005 merged 7 commits into
KDM-cli:mainfrom
utkarsh232005:phase3
Jun 7, 2026
Merged

feat: implement Phase 3 kdm analyze command#127
utkarsh232005 merged 7 commits into
KDM-cli:mainfrom
utkarsh232005:phase3

Conversation

@utkarsh232005

@utkarsh232005 utkarsh232005 commented Jun 7, 2026

Copy link
Copy Markdown
Member

This PR implements Phase 3 of the project: kdm analyze command without AI, including Pod, Deployment, Service, PVC, and Node analyzers along with unit tests.

Summary by CodeRabbit

New Features

  • Added analyze command for Kubernetes cluster diagnostics
  • Detects pod crashes and restart issues, deployment replica failures, service misconfigurations, unschedulable nodes, and storage binding problems
  • Supports filtering by namespace and label selector
  • JSON and text output formats available
  • Configurable kubeconfig and context for multi-cluster support

@github-actions github-actions Bot added documentation Improvements or additions to documentation frontend labels Jun 7, 2026
@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@utkarsh232005, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 9 minutes and 46 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 4a5f4527-2b3d-4666-a6e4-77bd5ac7b6bb

📥 Commits

Reviewing files that changed from the base of the PR and between 313c985 and 95f637a.

⛔ Files ignored due to path filters (2)
  • .DS_Store is excluded by !**/.DS_Store and included by none
  • coding_style.md is excluded by none and included by none
📒 Files selected for processing (13)
  • src/__tests__/analyze-command.test.ts
  • src/__tests__/kubernetes-analyzers.test.ts
  • src/analysis/analysis.ts
  • src/analysis/types.ts
  • src/analyzers/deployment.ts
  • src/analyzers/node.ts
  • src/analyzers/pod.ts
  • src/analyzers/pvc.ts
  • src/analyzers/service.ts
  • src/analyzers/types.ts
  • src/commands/analyze.ts
  • src/kubernetes/client.ts
  • src/kubernetes/resources.ts

Note

.coderabbit.yaml has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)
Validation error: Unrecognized key: "pre_merge_checks"
⚙️ Configuration instructions
  • Please see the configuration documentation for more information.
  • You can also validate your configuration using the online YAML validator.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
📝 Walkthrough

Walkthrough

This PR implements a complete Kubernetes analyzer system: five analyzer implementations detect pod crashes, deployment replica failures, service validation issues, PVC binding problems, and node pressure/scheduling anomalies. It adds configurable Kubernetes client support with kubeconfig/context overrides, resource listing helpers, an analyze CLI command, and comprehensive test coverage.

Changes

Kubernetes Analysis Feature

Layer / File(s) Summary
Kubernetes client and resource infrastructure
src/kubernetes/client.ts, src/kubernetes/resources.ts, src/analyzers/types.ts
Kubernetes client now supports configurable kubeconfig and context with per-key API client caching. Resource module wraps client APIs with namespace and label selector filtering for Pods, Deployments, Services, PVCs, and Nodes. Endpoints reading and label-to-selector conversion helpers added. AnalyzerContext extended with kubeconfig/kubecontext fields.
Analysis engine and option threading
src/analysis/types.ts, src/analysis/analysis.ts
AnalysisOptions gains optional kubeconfig and kubecontext fields. Engine threads these into the context object passed to each analyzer.
Analyzer implementations and registry
src/analyzers/pod.ts, src/analyzers/deployment.ts, src/analyzers/service.ts, src/analyzers/pvc.ts, src/analyzers/node.ts, src/analyzers/index.ts
Five analyzers detect Pod phase failures/restarts, Deployment unavailable replicas/progress deadlines, Service selector/endpoint/port validation, PVC pending/lost phases and binding conditions, and Node readiness/pressure/scheduling status. Registry imports and re-exports implementations.
Analyze CLI command and integration
src/commands/analyze.ts, src/commands/root.ts
New analyze/analyse command accepts namespace, label selector, analyzer filters, text/json output, max concurrency, and kubeconfig/context overrides; manages spinner, formats results, and sets exit code on error. Integrated into root command.
Test coverage
src/__tests__/analysis.test.ts, src/__tests__/analyze-command.test.ts, src/__tests__/kubernetes-analyzers.test.ts
Mocks for Kubernetes resources, full analyzer test scenarios (pod crashes, deployment failures, service validation, PVC binding, node pressure), and CLI option parsing and error handling validation.

Sequence Diagrams

sequenceDiagram
  participant User
  participant CLI as Analyze Command
  participant Engine as runAnalysis
  participant Analyzers
  participant K8s as Kubernetes API
  User->>CLI: analyze --namespace prod --filter pod,deployment
  CLI->>CLI: Validate output format, parse options
  CLI->>CLI: Start spinner (if text output)
  CLI->>Engine: runAnalysis(options with filters, namespace, kubeconfig)
  Engine->>Analyzers: forEach analyzer: analyze(context with kubeconfig/context)
  Analyzers->>K8s: listPods, listDeployments, etc (namespace, selector)
  K8s->>Analyzers: Resource lists
  Analyzers->>Analyzers: Detect failures per resource
  Analyzers->>Engine: AnalyzerResult[] with errors
  Engine->>CLI: Aggregated results
  CLI->>CLI: Format as JSON or text
  CLI->>User: Print output, stop spinner, exit 0 or 1
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The PR introduces substantial new analyzer logic across five separate implementations with distinct validation rules (pod phases/restarts, deployment replicas/conditions, service selector/endpoint/port resolution, PVC phases/conditions, node readiness/pressure). The resource infrastructure and client refactoring require careful review of caching logic and kubeconfig handling. Comprehensive test coverage aids review but the heterogeneity of analyzer failure detection logic (each checks different Kubernetes resource properties and states) and the multiple files involved (13 modified/added in src/ alone) demand careful reasoning per analyzer. Service analyzer's port resolution logic in particular is dense and worth careful attention to edge cases (string vs numeric targetPort handling).

Possibly related PRs

  • KDM-cli/kdm-cli#96: Extends Phase 2 analysis engine with kubeconfig/context threading into analyzer context, mirroring the same src/analysis/analysis.ts and type infrastructure.

Suggested labels

version2

Poem

🔍 Five analyzers wake to scan the cluster wide,
🚀 Pods and Nodes and Services, scrutinized with pride,
⚙️ Kubeconfig threads through the engine's beating heart,
📊 A CLI command completes the working part,
✨ Detect the crashes, replicas, and pain!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: implementing the Phase 3 kdm analyze command, which is the primary objective of this PR containing analyzers and command registration.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

codescene-delta-analysis[bot]

This comment was marked as outdated.

@codecov

codecov Bot commented Jun 7, 2026

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/analysis/analysis.ts (1)

50-104: 🧹 Nitpick | 🔵 Trivial | 🏗️ Heavy lift

Consider adding cancellation support for long-running analyses.

The analysis engine correctly isolates analyzer failures and respects concurrency limits, but it lacks cancellation support. For CLI tools analyzing large clusters, users may need to cancel long-running operations (Ctrl+C). Consider threading an AbortSignal through AnalysisOptions and AnalyzerContext to enable graceful cancellation.

This would improve UX when analyzing large namespaces or clusters with many resources.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/analysis/analysis.ts` around lines 50 - 104, Add cancellation support by
threading an AbortSignal from AnalysisOptions into the AnalyzerContext and
honoring it in runAnalysis: extend AnalysisOptions to accept signal
(AbortSignal), add that signal to the context object built in runAnalysis, and
ensure each analyzer receives the signal via analyzer.analyze(context). In
runAnalysis, check signal.aborted before starting work in the worker loop and
after awaiting measureDuration to break/return early and push a cancellation
error if desired; also pass the signal into any helper functions (e.g.,
measureDuration) or short-circuit calls when aborted. Update any types
referencing AnalyzerContext/AnalysisOptions and callers that construct
AnalyzerContext so analyzers can bail out promptly when the signal is triggered.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/__tests__/analyze-command.test.ts`:
- Around line 36-81: The test suite lacks coverage for error conditions around
runAnalysis and output validation—add tests that (1) mock runAnalysis to reject
(e.g., throw "K8s connection failed") and assert errorSpy is called and
process.exitCode is set to 1, (2) mock runAnalysis to resolve with a
ProblemDetected status or a non-empty errors array and assert error logging/exit
behavior, (3) verify text output format validation by calling program.parseAsync
with text output variations beyond JSON to assert proper error handling, and (4)
add edge-case tests for empty filters/missing namespace to confirm expected
behavior; use the existing runAnalysis mock, program.parseAsync invocations, and
spies (errorSpy, logSpy) in analyze-command.test.ts to implement these cases.
- Line 80: The test currently sets process.exitCode = undefined inline which may
not run on failures; add an afterEach hook in the test suite (same file/test
block where analyze-command tests run) that resets process.exitCode = undefined
to guarantee cleanup, i.e., add afterEach(() => { process.exitCode = undefined
}) near the existing beforeEach/afterAll hooks and remove the inline reset from
the test body (references: process.exitCode, afterEach).

In `@src/__tests__/kubernetes-analyzers.test.ts`:
- Around line 29-203: Add tests that simulate Kubernetes API failures by mocking
the relevant list/read helpers to reject (e.g.,
vi.mocked(listPods).mockRejectedValueOnce(new Error('API timeout'))), then
assert the analyzer behavior for PodAnalyzer.analyze,
DeploymentAnalyzer.analyze, ServiceAnalyzer.analyze,
PersistentVolumeClaimAnalyzer.analyze and NodeAnalyzer.analyze—either expecting
the promise to reject with the error or to resolve to an empty array depending
on the intended graceful-degradation behavior; ensure each test uses
vi.clearAllMocks() already present in beforeEach and references the exact helper
names (listPods, listDeployments, listServices, readEndpoints,
listPersistentVolumeClaims, listNodes) and analyzer symbols
(PodAnalyzer.analyze, DeploymentAnalyzer.analyze, ServiceAnalyzer.analyze,
PersistentVolumeClaimAnalyzer.analyze, NodeAnalyzer.analyze) so the test
additions are easy to locate and validate.

In `@src/analyzers/deployment.ts`:
- Around line 8-34: The replica availability checks in getDeploymentFailures
currently can emit two overlapping messages (the desired>available "Deployment
has {available}/{desired} available replicas" and the separate
unavailableReplicas message); change the logic so these are consolidated: when
desired > available emit only a single combined message like "Deployment has
{available}/{desired} available replica(s)" (handle pluralization) and skip
emitting the unavailableReplicas message in that branch; only emit the
unavailableReplicas message when desired <= available but
status.unavailableReplicas > 0. Keep all other condition checks
(Progressing/ProgressDeadlineExceeded and condition.status === 'False')
unchanged.

In `@src/analyzers/node.ts`:
- Around line 9-32: getNodeFailures has been flagged for complexity; split
responsibilities into small helpers to reduce cyclomatic complexity: extract
Ready check, pressure-condition check, and unschedulable check into separate
functions (e.g., isNodeReadyFailure(node: k8s.V1Node): Failure | null,
getPressureFailures(node: k8s.V1Node): Failure[], and
getUnschedulableFailure(node: k8s.V1Node): Failure | null), keep
PRESSURE_CONDITIONS usage inside the pressure helper, and have getNodeFailures
simply aggregate results from those helpers and return the combined Failure[].

In `@src/analyzers/pod.ts`:
- Around line 16-51: Static analysis flagged getPodFailures as a complex method;
refactor by extracting its logical sections into smaller helper functions to
preserve behavior and reduce complexity: create checkPodPhase(pod: k8s.V1Pod):
Failure[] to handle phase === 'Failed' logic, checkPodScheduling(pod:
k8s.V1Pod): Failure[] to handle Pending/PodScheduled checks, and
checkContainerStates(pod: k8s.V1Pod): Failure[] to iterate
pod.status?.containerStatuses and apply the WAITING_FAILURE_REASONS, readiness,
and RESTART_WARNING_THRESHOLD checks; then have getPodFailures return the
concatenation of those helpers' results so existing behavior and constants
remain unchanged.

In `@src/analyzers/pvc.ts`:
- Around line 8-32: getPvcFailures is flagged for cyclomatic complexity;
refactor it by extracting the distinct checks into small helper functions to
simplify the main flow: create helpers like checkPendingPhase(pvc:
V1PersistentVolumeClaim): Failure[], checkLostPhase(pvc:
V1PersistentVolumeClaim): Failure[], and collectStatusConditions(pvc:
V1PersistentVolumeClaim): Failure[] and have getPvcFailures call these helpers
and concatenate their results; keep the existing failure messages and the use of
pvc.status?.conditions and pvc.spec?.storageClassName so behavior remains
identical.
- Around line 12-17: When phase === 'Pending' avoid emitting both a generic
"PersistentVolumeClaim is Pending" and a more specific "pending without a
storage class"; instead, in the Pending handling (where phase and failures are
used and pvc.spec?.storageClassName is checked) push only the specific failure
if pvc.spec?.storageClassName is missing, otherwise push a single generic
Pending message — update the logic around the failures.push calls so the
storageClassName check controls whether to report the specific message and skip
the generic one.

In `@src/analyzers/service.ts`:
- Around line 11-75: getServiceFailures currently calls listPods per service
(via labelsToSelector -> listPods), causing N API calls; instead prefetch and
reuse pod lists per-namespace or batch services so you filter in-memory. Modify
the analyzer to fetch pods once per namespace (e.g., add a cached
listPodsCached(namespace, context) helper on AnalyzerContext or fetch all pods
before iterating services and pass matchingPods into getServiceFailures), then
replace the direct listPods call in getServiceFailures with a lookup from the
cache; alternatively, if services cannot share listings, run getServiceFailures
calls in parallel with Promise.all to avoid serial waits. Ensure you keep
selector logic (labelsToSelector) and existing behavior for ExternalName and
readEndpoints unchanged.
- Around line 83-93: The current sequential for-loop over services blocks
execution; change it to run analyses in parallel by mapping services to promises
that call getServiceFailures(service, context) and using Promise.allSettled(...)
on that array. After allSettled resolves, iterate the settled results: for each
fulfilled result, if the returned errors array has entries push the same object
shape (kind:'Service', name:serviceName(service),
namespace:serviceNamespace(service), errors) into results; for each rejected
result push or record a failure entry for that service (use
serviceName/serviceNamespace and include the rejection reason.message in errors)
so a single service error won’t abort the whole analysis. Ensure you keep
references to service in the map callback so you can build the result objects
after settling.

In `@src/commands/analyze.ts`:
- Around line 29-59: Add cancellation via an AbortController: create an
AbortController before calling runAnalysis, register process.on('SIGINT') to
call abortController.abort(), stop/fail the spinner (spinner?.fail('Analysis
cancelled')) and set process.exitCode = 130, and pass abortController.signal
into runAnalysis (update the runAnalysis call site and its signature/handlers to
accept a signal and wire it through to any Kubernetes API/client calls). Ensure
runAnalysis and any k8s client usage (e.g., watch/list/stream requests) respect
the provided AbortSignal so in-flight requests cancel cleanly.
- Around line 29-59: Replace direct console usage in the .action handler with
the project's logger: import logger from '../utils/logger', then use logger.info
or logger.debug instead of console.log for successful output (use logger.info
for formatTextOutput and logger.debug or logger.info for formatJsonOutput as
appropriate) and use logger.error instead of console.error when reporting errors
inside the catch block; also replace spinner?.stop/ fail messages that call
console with logger calls where appropriate or keep spinner messages but ensure
final output/error is emitted via logger (refer to the .action async callback,
variables output, spinner, runAnalysis, formatJsonOutput, formatTextOutput).

In `@src/kubernetes/client.ts`:
- Around line 21-31: Validate the kubeconfig path and surface loading errors
instead of silently ignoring them: before calling
kc.loadFromFile(options.kubeconfig) in the code that uses options.kubeconfig,
ensure options.kubeconfig is a safe, existing file within allowed boundaries
(e.g., resolve and check it is not using path traversal and that fs.statSync
reports a file), then call kc.loadFromFile; on any error from kc.loadFromFile,
kc.loadFromDefault, or kc.setCurrentContext, log the caught exception with
context (including the provided path and kubecontext) and rethrow or return an
error so callers can react; reference kc, options.kubeconfig, kc.loadFromFile,
kc.loadFromDefault, and kc.setCurrentContext when updating the try/catch
behavior.

In `@src/kubernetes/resources.ts`:
- Around line 70-81: The readEndpoints function currently swallows all errors
(using getK8sApi and unwrap) and returns undefined, which hides non-404
failures; update readEndpoints to catch errors from api.readNamespacedEndpoints,
inspect the error (e.g. HTTP status/code from the thrown k8s error) and only
return undefined for 404 NotFound, while re-throwing other errors so callers can
handle network/auth failures; alternatively return a result object { success:
boolean, endpoints?: V1Endpoints, error?: string } if you prefer explicit error
propagation—apply the change in the readEndpoints implementation and keep use of
getK8sApi and unwrap callers intact.

---

Outside diff comments:
In `@src/analysis/analysis.ts`:
- Around line 50-104: Add cancellation support by threading an AbortSignal from
AnalysisOptions into the AnalyzerContext and honoring it in runAnalysis: extend
AnalysisOptions to accept signal (AbortSignal), add that signal to the context
object built in runAnalysis, and ensure each analyzer receives the signal via
analyzer.analyze(context). In runAnalysis, check signal.aborted before starting
work in the worker loop and after awaiting measureDuration to break/return early
and push a cancellation error if desired; also pass the signal into any helper
functions (e.g., measureDuration) or short-circuit calls when aborted. Update
any types referencing AnalyzerContext/AnalysisOptions and callers that construct
AnalyzerContext so analyzers can bail out promptly when the signal is triggered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: abdf90f8-ecc9-48c1-8e14-6b71392be09e

📥 Commits

Reviewing files that changed from the base of the PR and between 986a6e7 and 313c985.

⛔ Files ignored due to path filters (2)
  • .gitignore is excluded by none and included by none
  • docs/what-is-kdm.md is excluded by none and included by none
📒 Files selected for processing (16)
  • src/__tests__/analysis.test.ts
  • src/__tests__/analyze-command.test.ts
  • src/__tests__/kubernetes-analyzers.test.ts
  • src/analysis/analysis.ts
  • src/analysis/types.ts
  • src/analyzers/deployment.ts
  • src/analyzers/index.ts
  • src/analyzers/node.ts
  • src/analyzers/pod.ts
  • src/analyzers/pvc.ts
  • src/analyzers/service.ts
  • src/analyzers/types.ts
  • src/commands/analyze.ts
  • src/commands/root.ts
  • src/kubernetes/client.ts
  • src/kubernetes/resources.ts

Comment thread src/__tests__/analyze-command.test.ts
Comment thread src/__tests__/analyze-command.test.ts Outdated
Comment thread src/__tests__/kubernetes-analyzers.test.ts
Comment thread src/analyzers/deployment.ts Outdated
Comment thread src/analyzers/node.ts Outdated
Comment thread src/analyzers/service.ts
Comment thread src/analyzers/service.ts Outdated
Comment thread src/commands/analyze.ts Outdated
Comment thread src/kubernetes/client.ts
Comment thread src/kubernetes/resources.ts
utkarsh232005 and others added 2 commits June 7, 2026 12:52
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

@codescene-delta-analysis codescene-delta-analysis Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gates Passed
3 Quality Gates Passed

See analysis details in CodeScene

Quality Gate Profile: The Bare Minimum
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.

@utkarsh232005 utkarsh232005 merged commit 9b45380 into KDM-cli:main Jun 7, 2026
11 of 12 checks passed
@utkarsh232005 utkarsh232005 added this to the v2.0.0 milestone Jun 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation frontend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant