Cross-compile scalagent to JVM (Scala.js + JVM artifacts) #44

Open
arcaputo3 wants to merge 4 commits into main from tjc-1110-jvm-cross-compile

@arcaputo3

Summary

Adds a JVM target to scalagent's Mill build so it publishes both a Scala.js artifact (current behavior, unchanged for ndagent/finagent/memtest/example users) and a new JVM artifact suitable for the JVM/Scala binaries used by Claude Managed Agents (CMA)-backed agents.

This was driven by Phase 2b of the tjc-agents CMA refactor plan — a JVM-mode "echobot" agent that wires anthropic-sdk-java (JVM-only) and scalagent's A2A protocol surface into one binary. Live-fire validated end-to-end with echobot deployed to Modal: A2A request → CMA session → tool_use → webhook → Modal Sandbox → tool result → A2A reply.

What changed

Source layout: src/ → src/{shared,js,jvm}/. Existing flat tree restructured into a nested layout matching the tjc-agents pkgs/sc/agents/common/{js,jvm,shared} convention.

  • src/shared/ (~28 files): cross-built — A2A protocol types (A2AMessage, A2AResponse, AgentCard, JsonRpc, A2ATask, etc.), A2AServerTypes.scala (A2AServer trait, A2AEventPublisher, A2AEventStore, A2AReplayProvider, A2ATaskStore + InMemoryTaskStoreImpl, PushNotificationUrlPolicy refactored to java.net.URI), errors, json codecs, ID types, light config.
  • src/js/ (~107 files): Scala.js only — Claude Agent SDK adapters (Claude, ClaudeAgent, Codex, MCP server bindings, all 17 toRaw: js.Object Hook/Tool/Permission/Config adapters), Bun-runtime A2A files (A2AServer Bun impl + A2AServerLive factory, A2AInternals, A2AClient fetch-based, A2AV03, WorkspaceStaging), DSL builder, macros, streaming, hooks, permissions, session.
  • src/jvm/ (new): zio-http A2AServerLive (~250 LOC, minimum-viable; happy-path SendMessage + GetTask + ListTasks + agent-card REST endpoint) + A2AServerLive.Config (drops the Claude-SDK-adapter agentOptions / invocationPreparer fields).

Build: agent becomes a Module containing agent.js (BunScalaJSModule, current behavior) and agent.jvm (new ScalaModule). Both publish under artifactName scalagent with Maven cross-build suffixes (scalagent_sjs1_3 vs scalagent_3).

Refactors to make types cross-buildable without breaking JS callers:

  • A2ATypes: @JSGlobal("crypto").randomUUID() → java.util.UUID.randomUUID() (polyfilled on Scala.js, native on JVM)
  • PushNotificationUrlPolicy.externalOnly: js.Dynamic.newInstance(URL)(url) → java.net.URI(url)
  • A2AServer.scala (1521L) split: trait + Config + auxiliary traits stay shared; private impls (A2AEventBus, A2ARuntimeRegistry, PushNotificationSender, EventStorePersister, ResultManager, A2ARequestHandler, A2AServerLiveImpl, BunServer) stay JS-only with A2AServerLive companion factory renamed from A2AServer.
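
As a rough illustration of the two substitutions above, the following sketch uses only java.util.UUID and java.net.URI, the two classes the PR names. The object and method names (CrossBuildSketch, newTaskId, hostOf) are hypothetical and do not come from the scalagent sources.

```scala
import java.net.URI
import java.util.{Locale, UUID}
import scala.util.Try

// Hypothetical helper names; only java.util.UUID and java.net.URI are taken from the PR.
object CrossBuildSketch:
  /** Generates an ID without referencing the JS `crypto` global. */
  def newTaskId(): String = UUID.randomUUID().toString

  /** Lower-cased host of a URL, or None if the URL does not parse or has no host. */
  def hostOf(url: String): Option[String] =
    Try(new URI(url)).toOption
      .flatMap(uri => Option(uri.getHost)) // getHost is null for opaque URIs such as mailto:
      .map(_.toLowerCase(Locale.ROOT))
```

Because neither method touches scala.scalajs.js, code in this shape can sit in a shared source directory, which is the property the refactors above are after.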

Tests (new JVM-side):

  • A2AServerLiveSpec (2 tests): config equality, start/stop lifecycle
  • A2ATaskStoreSpec (1 test): InMemory implementation round-trip
  • PushNotificationUrlPolicySpec (2 tests): externalOnly rejects loopback/RFC-1918, allowAll passes
  • Existing test files moved into test/{shared,js,jvm}/. Total: 137 JVM test cases pass; 190 JS test cases pass (no regressions).

Mill version: bumps PUBLISH_VERSION default to 0.8.0-SNAPSHOT to reflect the new cross-build shape (existing artifact 0.7.0-RC2 was Scala.js only).

Cross-repo handoff

This branch was developed against tjc-agents TJC-1110 which:

  1. Bumps RuntimeVersions.Scalagent to 0.8.0-SNAPSHOT
  2. Adds RuntimeVersions.AnthropicJava = "2.34.0"
  3. New pkgs/sc/cma/ vendored module: CmaSession (trait + AnthropicOkHttpClient wrapper), CmaEvent (decoded Scala-friendly view of BetaManagedAgentsStreamSessionEvents), CmaA2AExecution (factory producing the executionOverride function)
  4. New pkgs/sc/agents/echobot/: minimal CMA-backed Scala agent; deployed to Modal as a JVM fat-jar via a hand-rolled deploy.py (codegen for agentLoopMode = ManagedAgents is Phase 2c)

The 4 existing scalagent-consuming agents (ndagent, finagent, memtest, example) bump transparently — their BunScalaJSModule resolver picks the _sjs1_3 artifact, unchanged behavior.

Live-fire result

Phase 2b echobot fully deployed + tool-using flow exercised:

```
USER: What is the current working directory? Run pwd then exit.
CLAUDE → bash: pwd
SANDBOX: /__modal/volumes/vo-m6bSlsg21H2aBdrV2q6qSv
CLAUDE: The current working directory is /__modal/volumes/vo-m6bSlsg21H2aBdrV2q6qSv.
```

End-to-end (~16s warm round-trip) through Modal A2A web tier → scalagent.jvm A2AServerLive → CmaA2AExecution → anthropic-java → CMA → webhook → Modal Sandbox → tjc.managed_agents.worker → bash exec → Claude follow-up message.

Out of scope (Phase 2c+)

  • SSE streaming on JVM A2AServerLive. Current impl buffers all executionOverride events and returns a synchronous TaskResult; message/subscribe returns 405 today.
  • Per-contextId CMA session reuse. Each A2A message opens a fresh CMA session — taskId/contextId don't yet map to a persistent session_id.
  • Full BetaManagedAgentsStreamSessionEvents taxonomy. CmaEvent decoder covers agent.message + session.status.idle + a few; the rest fall into Unhandled. Concrete agents will expand as needed.
  • DeployPyGen integration. Echobot's deploy.py is hand-rolled; auto-codegen for agentLoopMode = ManagedAgents agents lands in Phase 2c.
  • Cross-built tests beyond the minimal new JVM specs. Most existing scalagent tests mock JS SDK responses and stay JS-only.
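
The buffer-then-return behaviour described in the first bullet can be pictured with a small sketch. Everything here (Event, TaskResult, EventPublisher, CollectingPublisher, runSync) is a simplified, hypothetical stand-in for the real A2A types, and the real implementation is ZIO-based rather than synchronous plain Scala.

```scala
import scala.collection.mutable.ListBuffer

// Simplified stand-ins; the real A2A event and task types are richer.
object CollectingSketch:
  enum Event:
    case Message(text: String)
    case StatusUpdate(state: String)

  final case class TaskResult(messages: List[String], finalState: String)

  trait EventPublisher:
    def publish(event: Event): Unit

  /** Buffers every published event instead of streaming it to a subscriber. */
  final class CollectingPublisher extends EventPublisher:
    private val buffer = ListBuffer.empty[Event]

    def publish(event: Event): Unit = buffer.synchronized { buffer += event; () }

    /** Folds the buffered events, in publish order, into one final result. */
    def result(): TaskResult = buffer.synchronized {
      buffer.foldLeft(TaskResult(Nil, "submitted")) {
        case (acc, Event.Message(text))       => acc.copy(messages = acc.messages :+ text)
        case (acc, Event.StatusUpdate(state)) => acc.copy(finalState = state)
      }
    }

  /** Runs the execution to completion, then returns the folded result in one response. */
  def runSync(execution: EventPublisher => Unit): TaskResult =
    val publisher = new CollectingPublisher
    execution(publisher)
    publisher.result()
```

The caller sees nothing until the execution finishes, which is why a streaming subscribe method needs a different publisher rather than a tweak to this one.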

Test plan

  • ./mill agent.js.compile clean (no regression)
  • ./mill agent.jvm.compile clean
  • ./mill agent.js.test passes 190 cases (no regression)
  • ./mill agent.jvm.test passes 137 cases
  • PUBLISH_VERSION=0.8.0-SNAPSHOT ./mill __.publishLocal produces both scalagent_sjs1_3 and scalagent_3 Maven artifacts in ~/.ivy2/local/
  • End-to-end live-fire with echobot deployed to Modal (chat-only + tools-enabled both work)

🤖 Generated with Claude Code

arcaputo3 and others added 4 commits May 26, 2026 13:55
Splits `agent` Mill module into `agent.js` + `agent.jvm`. Moves 17
hard-JS files (Claude SDK wrapper, Codex SDK, @a2a-js/sdk facades,
McpServer, AsyncIteratorOps, QueryStream, ZodFacade, ToolFiles) into
src-js/.

`mill agent.jvm.compile` still fails (232 errors) because 30+
shared-src files still use scala.scalajs.js (UndefOr / Dynamic /
Function / Promise). Exploration of the depth of contamination is
the next step before committing to the refactor scope.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Phase 1b checkpoint. Both targets compile cleanly:
- `mill agent.js.compile` — Scala.js (Bun) artifact, unchanged behaviour
- `mill agent.jvm.compile` — new JVM artifact, 28 shared sources

Structural changes:
- Source layout: src/{shared,js,jvm}/; test/{shared,js,jvm}/.
- build.mill: agent.js (BunScalaJSModule) + agent.jvm (ScalaModule).
  Both publish artifactName "scalagent" with cross-build suffixes.
- A2AServer.scala split: A2AServerTypes.scala in shared carries the
  pure traits (A2AServer, A2AEventPublisher, A2AEventStore,
  A2AReplayProvider, PushNotificationUrlPolicy, A2ATaskStore +
  InMemoryTaskStoreImpl). The 1521-line Bun runtime impl stays in
  src/js/a2a/A2AServer.scala as A2AServerLiveImpl + A2AServerLive
  factory + JS-only Config.
- A2ATypes: java.util.UUID instead of @JSGlobal("crypto"). Polyfilled
  on Scala.js, native on JVM.
- PushNotificationUrlPolicy.externalOnly: java.net.URI instead of
  js.Dynamic.newInstance(URL). Cross-build.
- A2AEventIds extracted to src/shared/a2a/ (pure Scala).

Files in src/shared/ (cross-built, ~28 files):
- a2a/ pure protocol types (A2AMessage, A2ARequest, A2AResponse, AgentCard,
  JsonRpc, ExecutionMode, A2AError, A2ATask, A2APushNotificationStore,
  A2ATypes, A2AEventIds, A2AServerTypes)
- errors/AgentError
- json/ codecs
- types/
- config/{Model, AgentModel, Effort, OutputStyle, PermissionMode,
  PositiveInt, PositiveDouble, ...} (the value types; the Claude-SDK
  adapter configs went to JS)

Files moved to src/js/ (~107 files):
- All Claude SDK adapters (Claude, ClaudeAgent, Codex/*, Hook*, ToolDef,
  AgentOptions, AgentDefinition, McpServerConfig, McpServer, etc.)
- Bun-runtime A2A files (A2AServer, A2AInternals, A2AClient, A2AV03,
  WorkspaceStaging, facade/*, A2ATypes-old)
- Macros (ToolMacros, ParamConverter, SchemaGen)
- Claude-loop infrastructure (core/, messages/, streaming/, hooks/,
  permissions/, session/, schema/, tools/, experimental/, interop/)
- DSL builder (BuilderConfig, AgentBuilder, TypedAgent, ToolSurface)
- Root package.scala (re-exports JS surfaces)

Out of scope: writing the JVM A2AServerLive (next task, ~300 LOC
zio-http). Publishing 0.8.0-SNAPSHOT. Tests cross-compile run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Minimum-viable JVM A2A server runtime for Phase 2b CMA agents. Mirrors
the JS A2AServerLive's request shape with a smaller surface — happy path
for synchronous message/send, plus tasks/get, tasks/list,
agent/getExtendedAgentCard, and a GET /.well-known/agent-card.json
REST endpoint.

src/jvm/com/tjclp/scalagent/a2a/A2AServerLive.scala:
- `A2AServerLive.Config` — JVM-only Config (no AgentOptions / no
  invocationPreparer; otherwise field-by-field parallel to the JS Config)
- `A2AServerLive.{create, start, live}` — factory methods matching the
  JS surface so callers don't reorganise
- `A2AServerLiveImpl` — private impl using:
  - `zio.http.Server.serve(Routes(POST /, GET /.well-known/agent-card.json))`
  - `CollectingPublisher` — in-memory `A2AEventPublisher` that buffers
    events from `executionOverride` and folds them into the final task
    (sync message/send semantics; SSE streaming deferred)
  - `taskStore` (in-memory default or caller-supplied) for tasks/get,
    tasks/list

Out of scope for this iteration:
- message/subscribe SSE streaming. The handler buffers all events and
  returns the final task; subscribe will be added later by mirroring
  the Bun A2AEventBus pattern.
- tasks/resubscribe, tasks/cancel
- Push notification config CRUD
- Per-task event store integration

These can be ported incrementally as JVM CMA agents need them. The
minimal surface is sufficient for Phase 2b echobot:
- Anthropic POSTs a message via the Go tjc CLI -> A2A POST /
- handler dispatches message/send to `executionOverride`
- executionOverride (in tjc-agents pkgs/sc/cma) talks CMA via
  anthropic-sdk-java, streams events back through the publisher
- handler folds events into the final task and returns

`mill agent.jvm.compile` clean; `mill agent.js.compile` unaffected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@claude
claude Bot commented May 26, 2026

Review: Cross-compile scalagent to JVM (PR #44)

Nice work splitting src/ into shared/js/jvm — the cross-build shape is clean and matches the tjc-agents convention. Live-fire validation against echobot/Modal is the right level of confidence for a Phase 2b cut. Most of the comments below are about hardening the new JVM A2AServerLive before it sees real-world traffic.

Critical / Security

  1. Config.maxRequestBodyBytes is silently unused. src/jvm/com/tjclp/scalagent/a2a/A2AServerLive.scala:40 declares it (default 1 MiB), but the route at :155 calls request.body.asString with no enforcement. A malicious client can post unlimited bodies and OOM the server. Either enforce it manually (request.body.asArray with a size check after read, or collectBoundedString) or configure zio-http's Server.Config request-body cap and drop the field.

  2. Config.pushNotificationUrlPolicy is silently unused. Default is externalOnly, but the JVM impl never calls validate. The field reads as load-bearing security but is a no-op today. Since push-notification CRUD is explicitly out of scope, I'd drop the field from Config until it's wired (or guard the default with a Scaladoc warning that it's currently inert). Same for taskTimeout, executionMode, eventStore, replayProvider, eventReplayLimit, eventStoreAppendTimeout, eventStoreLoadTimeout, pushNotificationStore — all present in Config, none read by the impl. Misleading API surface.

  3. tenant is client-supplied with no auth. A2ARequest.MessageSend.tenant (etc.) lets any client request data under any tenant string. Tenant isolation only works behind an auth layer that fills/overrides this server-side. Worth at least a Scaladoc warning on A2AServerLive.Config, ideally a Request => Option[String] hook for resolving tenant from headers/JWT.
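
For item 1, the manual size check could look roughly like the sketch below. It deliberately uses a plain java.io.InputStream rather than any zio-http API, so it shows only the bounding logic; BoundedBody and readBounded are hypothetical names, and how the stream is obtained from a request is left open.

```scala
import java.io.{ByteArrayOutputStream, InputStream}

// Hypothetical helper; illustrates the cap only, not the zio-http integration.
object BoundedBody:
  /** Reads the whole stream, or returns Left as soon as it exceeds `maxBytes`. */
  def readBounded(in: InputStream, maxBytes: Int): Either[String, Array[Byte]] =
    val out = new ByteArrayOutputStream()
    val chunk = new Array[Byte](8192)

    @annotation.tailrec
    def loop(total: Long): Either[String, Array[Byte]] =
      val n = in.read(chunk)
      if n == -1 then Right(out.toByteArray)
      else if total + n > maxBytes then Left(s"request body exceeds $maxBytes bytes")
      else
        out.write(chunk, 0, n)
        loop(total + n)

    loop(0L)
```

The check runs before each chunk is appended, so at most maxBytes are ever buffered (plus one scratch chunk), regardless of how much the client sends.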

Bugs / Correctness

  1. start returns success even when the server fails to bind. A2AServerLive.scala:133-143 forks Server.serve(...), immediately stores the fiber, and returns Task[Unit] — port-bind failures (BindException) only surface when the fiber crashes, which is observable only by fiber.await. Consider awaiting a readiness Promise populated from the server lifecycle, or interruptible-racing fork-result with a small startup deadline. Otherwise callers can't tell startup failed.

  2. Race between cancel and execution completion. In handleSendMessage (:223-237): load → fold events → conditional save-unless-canceled. If handleCancelTask runs between the load and the conditional save, the canceled status gets overwritten. Fix options: re-check under a per-task lock, or load-and-compare-and-swap before writing the final task.

  3. activeRuns.put registers after fork. :217-218: the fiber is added to activeRuns after run.fork. If cancel arrives in that window it can't interrupt, and combined with the cancel/completion race in item 2 above, the executor will then write past the canceled status. Either pre-register a Promise[Fiber] before forking, or use Semaphore/lock around the start-of-execution path.

  4. val a :: b :: _ = nums: @unchecked at A2AServerTypes.scala:129 is brittle. The unchecked annotation silences the compiler; if parseIpv4 ever returns < 2 elements this becomes a runtime MatchError. Since parseIpv4 always returns 4-element lists today, prefer destructuring all four (case List(a, b, c, d) =>) with an explicit fallback case, so the invariant is checked by the match instead of being silenced.

  5. InMemoryTaskStoreImpl.list filters/sorts timestamps as strings (:289): statusTimestampAfter.forall(after => task.status.timestamp.exists(_ >= after)). Works only if timestamps are canonical ISO-8601 (lex-orderable). Worth either a Scaladoc note or canonicalizing through OffsetDateTime.parse.

  6. Config.toAgentCard advertises http:// unconditionally (:42). Real deployments behind TLS (Modal/CMA) will advertise the wrong scheme in the agent card. Add an optional publicUrl (or scheme) override so the published AgentCard reflects the externally-reachable URL.

  7. Doc typo: A2AServerLive.scala:14 — "the JVM JVM scalagent build" (double JVM).
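
One way to picture the compare-and-swap option from item 2 is the sketch below. It is a plain-JVM illustration built on java.util.concurrent.ConcurrentHashMap, not the ZIO-based store in the PR, and the names (TaskStoreSketch, Status, Task, saveUnlessCanceled) are hypothetical.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical in-memory store; the real A2ATaskStore interface is not reproduced here.
object TaskStoreSketch:
  enum Status:
    case Working, Completed, Canceled

  final case class Task(id: String, status: Status)

  private val tasks = new ConcurrentHashMap[String, Task]()

  def put(task: Task): Unit =
    tasks.put(task.id, task)
    ()

  def get(id: String): Option[Task] = Option(tasks.get(id))

  /** Marks an existing task as canceled; a no-op for unknown IDs. */
  def cancel(id: String): Unit =
    tasks.computeIfPresent(id, (_, task) => task.copy(status = Status.Canceled))
    ()

  /** Stores `finalTask` unless the stored task is already canceled.
    * `compute` runs atomically per key, so a concurrent cancel cannot slip
    * in between the check and the write. Returns the task that ends up stored. */
  def saveUnlessCanceled(finalTask: Task): Task =
    tasks.compute(
      finalTask.id,
      (_, current) =>
        if current != null && current.status == Status.Canceled then current
        else finalTask
    )
```

The point of the design is that the check and the write happen in one atomic step; a separate load followed by a conditional save leaves the window the review describes.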

Test coverage

  1. No HTTP-layer tests on the JVM server. Both A2AServerLiveSpec tests call dispatchJsonRpc directly, bypassing the route handler at :151-168. The body-parsing / error-JSON-shaping / /.well-known/agent-card.json / malformed-JSON-RPC / method-not-found-envelope / oversized-body paths are uncovered. Even one happy-path round-trip via zio-http TestClient (or starting on an ephemeral port + java.net.http.HttpClient) would close the biggest gap.

  2. PushNotificationUrlPolicy IPv6 + RFC-1918 branches untested. PushNotificationUrlPolicySpec covers four numeric IPv4 loopback forms and one happy-path. The actual policy has logic for 100.64.0.0/10, 169.254.0.0/16, 172.16.0.0/12, 192.168.0.0/16, 198.18.0.0/15, multicast, fe80::, fc/fd, ff, and ::ffff: mapped addresses — none of which are covered. SSRF-defense is the whole point of the policy; off-by-one in CIDR membership checks is exactly the failure mode this code guards against. Adding rows to a parameterized test would be cheap.

  3. CLAUDE.md is out of date. It documents ./mill agent.test, ./mill agent.compile, ./mill agent.publishLocal. The aggregate agent.compile / agent.test aliases are preserved (good), but agent.publishLocal is gone — agent is now a Module, not a PublishModule. Either re-add an aggregate publishLocal task or update CLAUDE.md to point at agent.js.publishLocal / agent.jvm.publishLocal (or __.publishLocal).
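
The parameterized rows suggested in item 2 might look something like the following. The range check here is a standalone, IPv4-only approximation written for this illustration; it is not the PushNotificationUrlPolicy code, and Ipv4RangesSketch, parseIpv4 and isNonPublic are hypothetical names. It also matches on all four octets, in the spirit of the destructuring comment in the Bugs section.

```scala
// Standalone approximation for illustration; IPv6 and hostnames are not handled.
object Ipv4RangesSketch:
  /** Parses a dotted quad into exactly four octets (0-255), or None. */
  def parseIpv4(host: String): Option[List[Int]] =
    val parts = host.split("\\.", -1).toList
    val octets = parts.collect {
      case p if p.nonEmpty && p.length <= 3 && p.forall(c => c >= '0' && c <= '9') && p.toInt <= 255 =>
        p.toInt
    }
    Option.when(parts.length == 4 && octets.length == 4)(octets)

  /** True for the non-public IPv4 ranges listed in the review comment. */
  def isNonPublic(host: String): Boolean =
    parseIpv4(host) match
      case Some(List(a, b, _, _)) =>
        a == 10 ||                            // 10.0.0.0/8
        a == 127 ||                           // 127.0.0.0/8 loopback
        (a == 100 && b >= 64 && b <= 127) ||  // 100.64.0.0/10
        (a == 169 && b == 254) ||             // 169.254.0.0/16
        (a == 172 && b >= 16 && b <= 31) ||   // 172.16.0.0/12
        (a == 192 && b == 168) ||             // 192.168.0.0/16
        (a == 198 && (b == 18 || b == 19)) || // 198.18.0.0/15
        a >= 224                              // 224.0.0.0/4 multicast and above
      case _ => false                         // not a dotted quad: out of scope here

  /** Boundary rows: the last address outside and the first inside each range. */
  val cases: List[(String, Boolean)] = List(
    "127.0.0.1"       -> true,
    "10.1.2.3"        -> true,
    "100.63.255.255"  -> false,
    "100.64.0.0"      -> true,
    "100.127.255.255" -> true,
    "100.128.0.0"     -> false,
    "172.15.255.255"  -> false,
    "172.16.0.1"      -> true,
    "172.31.255.255"  -> true,
    "172.32.0.0"      -> false,
    "198.17.255.255"  -> false,
    "198.19.0.1"      -> true,
    "198.20.0.0"      -> false,
    "8.8.8.8"         -> false
  )

  /** Hosts whose classification differs from the expected value. */
  def failures: List[String] =
    cases.collect { case (host, expected) if isNonPublic(host) != expected => host }
```

Rows on either side of each CIDR boundary are what catch an off-by-one; a single address from the middle of a range would not.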

Minor / Style

  1. PushNotificationUrlPolicy.externalOnly swallows the underlying parser exception (:103). Good for security (don't leak parser internals), but consider ZIO.logDebug of the cause for debuggability.

  2. serverFiberRef is a Ref.Synchronized (:106) but is only touched from two synchronous code paths. A plain Ref would suffice.

  3. InMemoryTaskStoreImpl.list sorts the full filtered list then takes a slice — fine for in-process tests, but worth a Scaladoc note so durable implementers don't inherit the behavior unintentionally.

What I liked

  • The shared/js/jvm layout is genuinely clean — moving A2AServerTypes (with the cross-built PushNotificationUrlPolicy) to shared/ while keeping the JS impls behind a package object alias preserves source compat for existing JS callers without leaking js.Dynamic into the JVM build.
  • A2APlatform.randomUUID() polyfill split is the right pattern.
  • BunPublishModule + shared PomSettings / sharedZioDeps / sharedPublishVersion keeps the two artifacts in lockstep without forcing a common trait through the BunScalaJS/Scala hierarchies.
  • Tenant-scoping test on the JVM dispatch layer is exactly the right sanity check for the new code path.
