
feat(verifier): implement event-driven watching using kubewatch #8

Open
yuwenma wants to merge 2 commits into kubernetes-sigs:main from yuwenma:feature/verifier-watch

@yuwenma yuwenma commented Jun 11, 2026

Contributor

Description

This PR transitions the Verifier Agent from kubectl polling loops to an event-driven architecture powered by robusta-dev/kubewatch.

Tests

  • Ran make test locally. All 8 tests passed.
  • E2E GKE Integration Validation: Executed a real cluster test script. The verifier successfully detected the pod creation/readiness event via the webhook receiver and exited in 7.84 seconds (without waiting for the 60s fallback timeout).

Main changes:

  • Event-Driven Verifiers: Refactored pod_healthy and scaling_complete verifiers to block using threading.Event.wait(). They wake up and perform a single quick check only when notified by the webhook callback.
  • Vendor-Neutral Environment Isolation: Configures the kubewatch daemon using the KW_CONFIG environment variable to isolate the .kubewatch.yaml config within a temporary directory.
  • Local Webhook Receiver: Implements a lightweight HTTP webhook server on an ephemeral random port to receive real-time, structured JSON notifications of resource updates from kubewatch.
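The interaction between the webhook receiver and the blocking verifiers can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code in this PR: the names `WebhookReceiver`, `wait_until`, and `post_event` are hypothetical, the up-front check before the first wait is an assumption, and `post_event` merely stands in for kubewatch posting a notification.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class WebhookReceiver:
    """Hypothetical local receiver: stores posted JSON and wakes any waiter."""

    def __init__(self):
        self.event = threading.Event()
        self.payloads = []
        receiver = self

        class Handler(BaseHTTPRequestHandler):
            def do_POST(self):
                length = int(self.headers.get("Content-Length", 0))
                body = self.rfile.read(length)
                try:
                    receiver.payloads.append(json.loads(body or b"{}"))
                except ValueError:
                    pass  # malformed body: record nothing, but still wake the waiter
                receiver.event.set()
                self.send_response(204)
                self.end_headers()

            def log_message(self, format, *args):
                pass  # silence the default per-request stderr logging

        # Port 0 asks the OS to pick a free ephemeral port.
        self.server = HTTPServer(("127.0.0.1", 0), Handler)
        self.port = self.server.server_address[1]
        self.thread = threading.Thread(target=self.server.serve_forever, daemon=True)

    def start(self):
        self.thread.start()

    def stop(self):
        self.server.shutdown()
        self.server.server_close()


def wait_until(check, receiver, timeout=60.0):
    """Run `check` once up front, then once per wake-up, until it passes or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        receiver.event.wait(remaining)
        # Clear before the next check so an event arriving during the check
        # leaves the flag set and the following wait returns immediately.
        receiver.event.clear()


def post_event(port, payload):
    """Stand-in for kubewatch: POST a JSON payload to the local receiver."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req, timeout=5) as resp:
        return resp.status
```

In this sketch the `check` callable would wrap a single state lookup (for example, one kubectl query), so the verifier does no work between notifications and the timeout only acts as a fallback.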

@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: yuwenma
Once this PR has been reviewed and has the lgtm label, please assign janetkuo for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from janetkuo June 11, 2026 18:07
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes and size/XXL labels Jun 11, 2026
@yuwenma

yuwenma commented Jun 11, 2026

Contributor Author

This PR depends on #5

cc @pradeepvrd

Copilot AI left a comment


Pull request overview

This PR introduces a new verifier subsystem that shifts pod/deployment verification from kubectl polling loops to an event-driven model using a locally-run kubewatch daemon and a local webhook receiver, with Pydantic-based spec parsing and accompanying tests.

Changes:

  • Added an event-driven watcher service (KubeWatchService) with an embedded local HTTP webhook receiver to wake verifiers on resource events.
  • Implemented pod_healthy and scaling_complete verifiers that block on threading.Event.wait() and re-check state only after notifications.
  • Introduced a typed VerificationSpec (Pydantic v2 RootModel) plus a VerifierAgent for evaluating single and compound verification specs, with tests and documentation.
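To illustrate how a locally-run kubewatch daemon can be pointed at an isolated configuration, here is a hedged sketch of the `KW_CONFIG` approach mentioned in the PR description. The helper name `prepare_kubewatch_env` is hypothetical, and the YAML keys shown (`handler.webhook.url`, `resource.pod`, `resource.deployment`) are an assumption about the kubewatch config schema that should be checked against the kubewatch documentation; the PR's actual config is not shown in this conversation.

```python
import os
import tempfile
from pathlib import Path


def prepare_kubewatch_env(webhook_url, base_env=None):
    """Write .kubewatch.yaml into a fresh temp dir and return (env, config_dir).

    The returned env is a copy; the current process environment is not modified.
    """
    config_dir = Path(tempfile.mkdtemp(prefix="kubewatch-"))
    # Assumed config layout: a webhook handler plus the resource kinds to watch.
    config_text = (
        "handler:\n"
        "  webhook:\n"
        f"    url: {webhook_url}\n"
        "resource:\n"
        "  pod: true\n"
        "  deployment: true\n"
    )
    (config_dir / ".kubewatch.yaml").write_text(config_text)

    env = dict(os.environ if base_env is None else base_env)
    # KW_CONFIG names the directory that holds .kubewatch.yaml.
    env["KW_CONFIG"] = str(config_dir)
    return env, config_dir
```

The returned `env` would then be passed to something like `subprocess.Popen(["kubewatch"], env=env)`, and the caller is responsible for removing `config_dir` (for example with `shutil.rmtree`) once the daemon has stopped. Keeping the config out of the user's home directory means a test run cannot overwrite an existing `~/.kubewatch.yaml`.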

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| pyproject.toml | Adds pydantic>=2.0.0 dependency for spec/result models. |
| Makefile | Adds install/test targets and a kubewatch install helper. |
| devops_bench/agents/verifier/watcher.py | Adds webhook server + kubewatch process lifecycle management. |
| devops_bench/agents/verifier/verifiers/pod_healthy_verifier.py | Adds event-driven pod readiness verifier. |
| devops_bench/agents/verifier/verifiers/scaling_complete_verifier.py | Adds event-driven deployment scaling verifier. |
| devops_bench/agents/verifier/verifiers/__init__.py | Initializes verifiers package. |
| devops_bench/agents/verifier/verifier.py | Adds agent entrypoint that executes single/compound verification specs. |
| devops_bench/agents/verifier/utils.py | Adds Timer and kubectl default timeout constant. |
| devops_bench/agents/verifier/test_verifier.py | Adds unit tests covering verifiers and compound spec execution. |
| devops_bench/agents/verifier/spec.py | Defines the typed Pydantic RootModel spec format. |
| devops_bench/agents/verifier/README.md | Documents the verifier API and event-driven kubewatch architecture. |
| devops_bench/agents/verifier/base.py | Adds base verifier interface and VerificationResult model. |
| devops_bench/agents/verifier/__init__.py | Initializes verifier package. |

Comment thread Makefile Outdated
Comment thread devops_bench/agents/verifier/watcher.py
Comment thread devops_bench/agents/verifier/watcher.py
Comment thread devops_bench/agents/verifier/watcher.py
Comment thread devops_bench/agents/verifier/watcher.py
Comment thread devops_bench/agents/verifier/watcher.py
Comment thread devops_bench/agents/verifier/verifiers/pod_healthy_verifier.py
Comment thread devops_bench/agents/verifier/spec.py Outdated
@yuwenma yuwenma force-pushed the feature/verifier-watch branch from e820ad5 to c1732d6 Compare June 12, 2026 17:00
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 12, 2026
@janetkuo

Member

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jun 12, 2026
Comment thread Makefile Outdated
@yuwenma yuwenma force-pushed the feature/verifier-watch branch 2 times, most recently from ecc4c58 to c803343 Compare June 12, 2026 20:36
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 12, 2026
@yuwenma yuwenma force-pushed the feature/verifier-watch branch from c803343 to 5ac885c Compare June 12, 2026 20:46
@janetkuo

Member

/hold

Per discussion with @pradeepvrd, we're considering some refactoring before merging this. Holding this PR for now.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold and needs-rebase labels Jun 15, 2026
@k8s-ci-robot

Contributor

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Labels

  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.)
  • needs-rebase (Indicates a PR cannot be merged because it has merge conflicts with HEAD.)
  • size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)
  • tide/merge-method-squash (Denotes a PR that should be squashed by tide when it merges.)

5 participants