Add DSPy RLM based research agent by wallyqs · Pull Request #13 · synadia-ai/synadia-agents

wallyqs · 2026-04-24T15:36:08Z

Add example agent of using web search and RLMs for research:

RESEARCH_DEBUG=1 RESEARCH_PROVIDER=exa RESEARCH_MODEL="openai/gpt-oss-120b" ./scripts/run.sh

synadia-claude-reviewer · 2026-04-24T15:36:45Z

Claude finished @wallyqs's task in 4m 5s —— View job

PR Review: Add DSPy RLM based research agent

Gather context (read PR diff, changed files, CLAUDE.md)
Review src/index.ts — correctness, NATS protocol compliance, error handling
Review src/search.ts — security, correctness
Review scripts/run.sh and scripts/ask.ts — sandbox soundness
Review package.json / tsconfig.json — dependency hygiene
Post inline comments on specific issues
Write overall review summary

Summary

The RLM wiring is solid — AxJSRuntime with only TIMING permission keeps the sandbox tight, the checkpointed/balanced context policy is appropriate for long research runs, and the stream terminator (msg.respond("") in a finally block) is correctly spec-compliant. The NVIDIA-vLLM scrub workarounds (parallel_tool_calls, reasoning_effort rewrite) are well-isolated in scrubbedFetch.

A few real issues to address before merging:

Security / correctness (should fix):

URL scheme not validated — web.fetch and web.findSimilar pass LLM-generated URLs directly to Tavily/Exa without checking for http(s)://. A prompt-injection-compromised actor could direct those providers to attempt arbitrary URLs. Fix → (see inline comment on search.ts:67)
USER env var as NATS subject token — if USER contains . (e.g. john.doe), the subject becomes agents.research.john.doe.rlm — five tokens instead of four, breaking discovery. Sanitize or validate on startup. (see inline comment on index.ts:198)
Module-level turn counter — shared across concurrent requests; concurrent debug files collide. Scope the counter per-request or use a random prefix. (see inline comment on index.ts:39)

Protocol compliance:

No error chunk type — mid-stream errors arrive as {"type":"response","data":"[agent error] …"}, which clients cannot distinguish from valid response text. Define and emit a {"type":"error","data":"…"} chunk type and document it in the README. (see inline comment on index.ts:293)

Minor (good to fix, not blocking):

Stale Content-Length on scrubbed response — res.headers is forwarded verbatim onto the re-serialised body; strip content-length from the rebuilt Response. (see inline comment on index.ts:85)
DEBUG files world-readable in /tmp — use { mode: 0o600 } in fs.writeFile or write to a per-process private directory. (see inline comment on index.ts:65)
run.sh always warns about TAVILY_API_KEY even when RESEARCH_PROVIDER=exa. (see inline comment on run.sh:22)

synadia-claude-reviewer · 2026-04-24T15:39:56Z

+  }
+
+  async fetch(url: string): Promise<FetchedPage> {
+    const res = await fetch("https://api.tavily.com/extract", {


URL scheme not validated before delegating to Tavily/Exa

url here originates from the LLM-generated REPL code. There's no guard to reject non-http(s):// schemes (e.g. file://, ftp://, or an attacker-injected URL via a compromised search result). The same applies to ExaProvider.fetch() (line 125) and ExaProvider.findSimilar() (line 146).

Even though Tavily/Exa make the actual outbound request (not this process), arbitrary URLs are still forwarded to those third-party services — potential for prompt-injection-driven SSRF via the provider.

Suggested fix — add a guard in each provider's fetch/findSimilar, or centrally in the index.ts tool handlers:

if (!/^https?:\/\//i.test(url)) throw new Error(`fetch: only http/https URLs are supported, got: ${url.slice(0, 80)}`);

synadia-claude-reviewer · 2026-04-24T15:39:59Z

+// non-reasoning models (ax rejects non-string `thought`) and refuses parallel
+// tool calls on some llama variants. Same scrub as examples/dspy/src/index.ts.
+const DEBUG_TRACE = process.env["RESEARCH_DEBUG"] === "1";
+let turn = 0;


Module-level turn counter races across concurrent requests

turn is a global, so two simultaneous promptHandler invocations will interleave increments. In debug mode this produces wrong pairings of research-req-N.json ↔ research-res-N.json, and in principle the scrub-fetch wrapper reads a file it didn't write. Move the counter inside scrubbedFetch as a closure variable scoped to the NATS handler invocation, or use a per-request unique prefix (e.g. a short uuid) for the debug file names.

synadia-claude-reviewer · 2026-04-24T15:40:09Z

+      // non-JSON body
+    }
+  }
+  if (DEBUG_TRACE && finalInit?.body) {


DEBUG traces written to world-readable /tmp with predictable filenames

/tmp/research-req-N.json contains the full request body sent to the NVIDIA endpoint — including the bearer token in the Authorization header if ax places it there, plus the user's research question and any accumulated REPL context. /tmp/research-res-N.json contains raw search results that may include scraped page content.

On a multi-user machine these files are readable by any local user (/tmp has sticky bit but files are created with the process's default umask, typically 0644). Suggestions:

Use fs.writeFile(path, data, { mode: 0o600 }) to restrict permissions, or

Write to a private directory under $XDG_RUNTIME_DIR / os.tmpdir() with a random suffix.

synadia-claude-reviewer · 2026-04-24T15:40:12Z

+const agentService = new ReferenceAgent({
+  nc,
+  agent: "research",
+  owner: process.env["USER"] ?? "anon",


USER env var used unsanitized as a NATS subject token

NATS subjects use . as a delimiter; * and > are wildcards. If USER contains any of those characters (e.g. john.doe, or is empty and falls through to "anon" on a headless runner), the resulting subject agents.research.<USER>.rlm becomes malformed or ambiguous.

Consider validating/sanitizing the value:

const rawOwner = process.env["USER"] ?? "anon"; const owner = rawOwner.replace(/[.*>]/g, "_");

Or reject at startup with a clear error message if the value is invalid.

synadia-claude-reviewer · 2026-04-24T15:40:22Z

+        citations.forEach((u, i) => chunk("response", `\n[${i + 1}] ${u}`));
+      }
+    } catch (err) {
+      const message = err instanceof Error ? err.message : String(err);


Mid-stream errors sent as response type — no error envelope type defined

When researcher.forward() rejects, the error string is wrapped as {"type":"response","data":"[agent error] …"}. Clients have no reliable way to distinguish a genuine partial response from a terminal error.

The spec calls for a 500 error envelope for server errors (msg.respondError(500, ...)), but here the stream has already started ("ack" was sent on line 222), so respondError can no longer be used exclusively.

Consider adding an error chunk type to the streaming protocol and documenting it in the README:

// Instead of chunk("response", `\n[agent error] ${message}`) msg.respond(encoder.encode(JSON.stringify({ type: "error", data: message })));

This lets SDK clients (and future typed wrappers) surface errors cleanly without text-scanning response content.

synadia-claude-reviewer · 2026-04-24T15:40:27Z

+    }
+    return out;
+  };
+  return new Response(JSON.stringify(scrub(body)), {


Reusing res.headers on a reconstructed Response carries a stale content-length

JSON.stringify(scrub(body)) will almost always differ in byte-length from the original response body (the scrubber removes at least the reasoning_content keys). Forwarding res.headers as-is means the new Response carries a Content-Length that no longer matches the body. Most HTTP clients ignore Content-Length when they already have the full body, but it's technically incorrect and could surprise consumers that inspect headers.

Fix: strip or override Content-Length on the rebuilt response:

const newHeaders = new Headers(res.headers); newHeaders.delete("content-length"); return new Response(JSON.stringify(scrub(body)), { status: res.status, statusText: res.statusText, headers: newHeaders, });

synadia-claude-reviewer · 2026-04-24T15:40:33Z

+  exit 1
+fi
+
+if [[ -z "${TAVILY_API_KEY:-}" ]]; then


Warning always checks TAVILY_API_KEY regardless of selected provider

If the user sets RESEARCH_PROVIDER=exa and EXA_API_KEY, this warning fires anyway and may confuse them. The check should be conditioned on the selected provider, or warn generically that at least one of TAVILY_API_KEY/EXA_API_KEY is needed:

if [[ -z "${TAVILY_API_KEY:-}" && -z "${EXA_API_KEY:-}" ]]; then echo "note: neither TAVILY_API_KEY nor EXA_API_KEY is set — web search will be disabled" >&2 fi

Signed-off-by: Waldemar Quevedo <wally@nats.io>

synadia-claude-reviewer Bot reviewed Apr 24, 2026

View reviewed changes

Add DSPy RLM based research agent

bfb9287

Signed-off-by: Waldemar Quevedo <wally@nats.io>

wallyqs force-pushed the research-agent branch from 6f7511a to bfb9287 Compare May 8, 2026 17:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add DSPy RLM based research agent#13

Add DSPy RLM based research agent#13
wallyqs wants to merge 1 commit into
mainfrom
research-agent

wallyqs commented Apr 24, 2026

Uh oh!

synadia-claude-reviewer Bot commented Apr 24, 2026 •

edited

Loading

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

wallyqs commented Apr 24, 2026

Uh oh!

synadia-claude-reviewer Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Add DSPy RLM based research agent

Summary

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

synadia-claude-reviewer Bot Apr 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

synadia-claude-reviewer Bot commented Apr 24, 2026 •

edited

Loading