feat: BrowserUse Scenario Support#300
Conversation
Signed-off-by: Jagjeevan Kashid <jagjeevandev97@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Adds first-class BrowserUse scenario support to AgentCube, extending the existing AgentRuntime/CodeInterpreter end-to-end flow (CRD → Workload Manager provisioning → Router invocation proxying).
Changes:
- Introduces
BrowserUseCRD types (plus generated deepcopy + registration metadata) and an example manifest. - Extends Workload Manager informers/routes/handlers and workload builder to create sandboxes from
BrowserUsetemplates. - Extends Router routes/handlers/session-manager to proxy BrowserUse invocations and create BrowserUse sandboxes via Workload Manager.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/workloadmanager/workload_builder.go | Adds sandbox build path for BrowserUse resources. |
| pkg/workloadmanager/server.go | Adds WM management routes for BrowserUse create/delete. |
| pkg/workloadmanager/informers.go | Adds BrowserUse informer + cache sync gating. |
| pkg/workloadmanager/informers_test.go | Updates informer sync tests to include BrowserUse. |
| pkg/workloadmanager/handlers.go | Adds BrowserUse create handler and kind switch branch. |
| pkg/router/session_manager.go | Adds BrowserUse → WM endpoint mapping for sandbox creation. |
| pkg/router/server.go | Adds BrowserUse invocation routes. |
| pkg/router/handlers.go | Adds BrowserUse invocation handler wired into common invoke path. |
| pkg/common/types/types.go | Adds BrowserUseKind constant. |
| pkg/common/types/sandbox.go | Allows BrowserUseKind in CreateSandboxRequest validation. |
| pkg/api/errors.go | Adds BrowserUse “not found” error and resource mapping. |
| pkg/apis/runtime/v1alpha1/browseruse_types.go | New BrowserUse CRD Go types. |
| pkg/apis/runtime/v1alpha1/register.go | Adds BrowserUse GVK metadata. |
| pkg/apis/runtime/v1alpha1/zz_generated.deepcopy.go | Generated deepcopy for BrowserUse types. |
| example/browser-agent/browseruse.yaml | Example BrowserUse custom resource. |
| docs/design/browseruse-proposal.md | RFC-style design/proposal documentation for BrowserUse. |
| // handleBrowserUseCreate handles BrowserUse sandbox creation requests. | ||
| func (s *Server) handleBrowserUseCreate(c *gin.Context) { | ||
| s.handleSandboxCreate(c, types.BrowserUseKind) | ||
| } |
There was a problem hiding this comment.
There’s no unit test coverage for the new BrowserUse create path. TestHandleSandboxCreate currently covers AgentRuntime and CodeInterpreter, but doesn’t exercise handleBrowserUseCreate / the types.BrowserUseKind branch, so regressions in this new endpoint won’t be caught.
| // handleBrowserUseInvoke handles browser use invocation requests | ||
| func (s *Server) handleBrowserUseInvoke(c *gin.Context) { | ||
| namespace := c.Param("namespace") | ||
| name := c.Param("name") | ||
| path := c.Param("path") | ||
| s.handleInvoke(c, namespace, name, path, types.BrowserUseKind) | ||
| } |
There was a problem hiding this comment.
No router handler test exercises the new BrowserUse invocation route/handler. handlers_test.go has coverage for AgentRuntime and CodeInterpreter invocations, but nothing for /browser-uses/.../invocations/*path, so this new path could break without failing tests.
| endpoint = m.workloadMgrAddr + "/v1/agent-runtime" | ||
| case types.CodeInterpreterKind: | ||
| endpoint = m.workloadMgrAddr + "/v1/code-interpreter" | ||
| case types.BrowserUseKind: | ||
| endpoint = m.workloadMgrAddr + "/v1/browser-use" |
There was a problem hiding this comment.
createSandbox gained a new BrowserUseKind endpoint mapping, but session_manager_test.go doesn’t cover this case (it only verifies AgentRuntime and CodeInterpreter). Add a test asserting the request hits /v1/browser-use and marshals kind=BrowserUse.
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #300 +/- ##
==========================================
+ Coverage 43.37% 45.55% +2.18%
==========================================
Files 30 31 +1
Lines 2610 2961 +351
==========================================
+ Hits 1132 1349 +217
- Misses 1355 1472 +117
- Partials 123 140 +17
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Code Review
This pull request introduces the BrowserUse CRD to AgentCube for browser automation workloads, including updates to the workload manager, router, and session management. Feedback focuses on improving robustness by adding a nil check for the pod template and handling ignored errors in the workload builder. Additionally, there are suggestions to standardize the targetPort field name to ports for API consistency and to fix a local file link in the design document.
| name: playwright-mcp | ||
| namespace: default | ||
| spec: | ||
| targetPort: |
| name: playwright-mcp | ||
| namespace: default | ||
| spec: | ||
| targetPort: |
| // Ports is a list of ports that the browser runtime will expose. | ||
| // Typically includes the browser automation protocol port (e.g., Playwright MCP on 8931) | ||
| // and optionally a VNC port for visual debugging. | ||
| Ports []TargetPort `json:"targetPort"` |
There was a problem hiding this comment.
The field Ports is a slice, but its JSON tag is targetPort (singular). Following Kubernetes and Go conventions, plural fields should typically have plural JSON tags (e.g., ports or targetPorts) to improve API clarity and consistency.
| Ports []TargetPort `json:"targetPort"` | |
| Ports []TargetPort `json:"ports"` |
Run make gen-client to generate typed clients, listers, and informers for the new BrowserUse CRD. Also fix update-codegen.sh to patch the GroupResource() call for browseruse in the generated lister (same fix applied to agentruntime and codeinterpreter). Signed-off-by: Jagjeevan Kashid <jagjeevandev97@gmail.com>
The BrowserUseInformer requires list/watch permissions on the browseruses resource. Without this, the workload manager pod cannot start (WaitForCacheSync times out causing Helm --wait to fail in e2e). Signed-off-by: Jagjeevan Kashid <jagjeevandev97@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Jagjeevan Kashid <jagjeevankashid97@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jagjeevan Kashid <jagjeevankashid97@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Jagjeevan Kashid <jagjeevankashid97@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Jagjeevan Kashid <jagjeevankashid97@gmail.com>
| browserUseKey := namespace + "/" + name | ||
| // TODO(hzxuzhonghu): make use of typed informer, so we don't need to do type conversion below | ||
| runtimeObj, exists, err := ifm.BrowserUseInformer.GetStore().GetByKey(browserUseKey) | ||
| if err != nil { | ||
| return nil, nil, fmt.Errorf("failed to get browser use %s from informer store: %w", browserUseKey, err) | ||
| } |
There was a problem hiding this comment.
GetByKey can return a non-nil error; currently it is ignored (_). This can mask informer/cache failures as a "not found" and make troubleshooting harder. Handle and return the error (wrapping it) before checking exists (similar to buildSandboxByCodeInterpreter).
|
|
||
| #### Why a Separate CRD? | ||
|
|
||
| The rationale mirrors the original decision to separate `CodeInterpreter` from `AgentRuntime` (see [runtime-template-proposal.md](./runtime-template-proposal.md)): |
There was a problem hiding this comment.
This document links to a local file:///Users/... path, which will be broken for anyone else reading the repo. Replace it with a repo-relative link (e.g., docs/design/runtime-template-proposal.md) or a URL that is valid in GitHub.
|
This PR is ready from my end, and it will serve as a good starting point for the initial version launch of this feature. CC: @YaoZengzeng, @acsoto, @tizhou86 |
Signed-off-by: Jagjeevan Kashid <jagjeevandev97@gmail.com>
|
Keywords which can automatically close issues and at(@) or hashtag(#) mentions are not allowed in commit messages. The list of commits with invalid commit messages:
DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR introduces support for the
BrowserUsescenario within the AgentCube platform as a first-class citizen, mirroring theAgentRuntimeandCodeInterpreterpatterns. It provides a purpose-built declarative API for browser automation workloads.Changes:
BrowserUseCRD type.BrowserUselifecycle events (informer, builder, server/handlers).BrowserUseinvocation requests.SessionManager.docs/design.Which issue(s) this PR fixes:
Fixes #26
Special notes for your reviewer:
informers_test.goto prevent panics due to the new informer being uninitialized in tests.Does this PR introduce a user-facing change?: