Skip to content

feat: add Tavily as configurable WebSearch provider#2

Open
tavily-integrations wants to merge 1 commit into
wenjiazhu1980:mainfrom
Tavily-FDE:feat/tavily-migration/deepcode-web-search-tavily
Open

feat: add Tavily as configurable WebSearch provider#2
tavily-integrations wants to merge 1 commit into
wenjiazhu1980:mainfrom
Tavily-FDE:feat/tavily-migration/deepcode-web-search-tavily

Conversation

@tavily-integrations

Copy link
Copy Markdown

Summary

Adds Tavily as an opt-in search backend for the WebSearch tool. When TAVILY_API_KEY is configured in settings env (or via DEEPCODE_TAVILY_API_KEY system env var), searches route through Tavily's REST API instead of the built-in deepcode.vegamo.cn endpoint.

The search priority order is preserved: custom script (webSearchTool) → Tavily (TAVILY_API_KEY) → built-in default. The existing provider and its machineId-based auth remain fully intact.

Files changed

  • src/tools/web-search-handler.ts — Added executeTavilyWebSearch() function that calls the Tavily /search REST API with advanced search depth and formats results as markdown. Inserted a priority check for TAVILY_API_KEY between the custom script check and the default provider fallback.
  • src/settings.ts — Added explicit TAVILY_API_KEY? field to DeepcodingEnv type for IDE discoverability.
  • docs/configuration.md — Documented TAVILY_API_KEY in the env sub-fields table and added a Tavily integration section under webSearchTool.
  • docs/configuration_en.md — English counterpart of the above documentation changes.

Dependency changes

  • None — uses native fetch against https://api.tavily.com/search to avoid adding a new npm dependency.

Environment variable changes

  • Added TAVILY_API_KEY — set in settings.json env block or as DEEPCODE_TAVILY_API_KEY system env var to activate the Tavily provider.

Notes for reviewers

  • No existing dependencies or env vars were removed (additive migration).
  • TypeScript type-check passes cleanly with zero errors.
  • The Tavily integration uses search_depth: "advanced" and max_results: 5 as sensible defaults. These could be made configurable via settings in a follow-up if needed.

Automated Review

  • Passed after 1 attempt(s)
  • Final review: The Tavily migration is well-structured and functionally correct. The additive approach preserves the existing custom-script and default search paths, slots Tavily cleanly into the priority chain, uses appropriate lifecycle hooks (onProcessStart/onProcessExit), includes proper error handling, updates the type system, and documents the new feature in both language variants. There are no critical or major issues — only a handful of minor style and consistency observations.

@wenjiazhu1980 wenjiazhu1980 self-assigned this Jun 15, 2026

@wenjiazhu1980 wenjiazhu1980 left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security & Correctness Review

Reviewed commit d24a454 (single commit, bot-authored via tavily-pr-agent@noreply.com). No high-severity / critical issues found — the code is additive, uses native fetch with a hardcoded HTTPS endpoint, and introduces no new dependencies. A few items worth addressing before merging:

✅ What's fine

  • No SSRF: endpoint is hardcoded https://api.tavily.com/search; the query is JSON-serialized into the body with no URL or command injection surface.
  • No dependency pollution: uses native fetch; the package-lock.json dependency tree is unchanged.
  • No secret leakage: the API key stays a local function param — it doesn't enter metadata or the activity label passed to onProcessStart (only query does).
  • No eval / shell / filesystem access — pure network + string formatting.

⚠️ Should fix before merge

1. [Medium] Auth method is outdated — use Bearer header, not api_key in body
The key is sent in the request body (api_key: apiKey). Per the current Tavily docs, authentication should be an Authorization: Bearer tvly-... header. The body method is legacy v1 — it may still work but is not recommended and could break on the next API iteration. ↴ (inline comment on the relevant line)

2. [Low/Correctness] package-lock.json version downgrade
This hunk bumps the lockfile from 0.1.270.3.5, but main is already at 0.3.6. Merging as-is would regress the lockfile version. Suggestion: drop this hunk on merge (git checkout --ours package-lock.json). ↴ (inline comment)

💡 Suggestions (non-blocking)

3. [Low] Untruncated upstream response echoed into error output
The raw upstream body flows into the tool error (and thus the LLM context / session log). Consider truncating. ↴ (inline comment)

4. [Low/UX] Silent switch from free to paid provider
Setting TAVILY_API_KEY automatically reroutes WebSearch away from the built-in free endpoint to the paid Tavily API (higher priority than the default). A user who already has this var set for another tool could incur unexpected charges. Consider documenting the billing impact explicitly, or gating on an explicit opt-in (e.g. webSearchProvider: "tavily").

5. [Supply chain note] Bot-authored vendor integration
Auto-generated by Tavily's PR bot — routes traffic to a paid endpoint, a form of vendor lock-in delivered as a PR. Code looks clean, but worth a human review on every update from this account.

Verdict

No blocking vulnerabilities. Recommend the author address #1 (auth) and #2 (lockfile) before merging; #3–#5 are improvements.

"Content-Type": "application/json",
},
body: JSON.stringify({
api_key: apiKey,

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Medium] Auth method is outdated — use Bearer header, not api_key in body

Per the current Tavily docs, authentication should be sent as an Authorization: Bearer tvly-... header, not as a body field.

Sending api_key in the body is the legacy v1 approach. It may still work today but is not recommended and could break on the next API iteration. Additionally, body payloads are more likely to be logged by proxies and intermediaries than headers.

Recommended fix:

headers: {
  "Content-Type": "application/json",
  "Authorization": `Bearer ${apiKey}`,
},
body: JSON.stringify({
  query,
  max_results: TAVILY_DEFAULT_MAX_RESULTS,
  search_depth: TAVILY_DEFAULT_SEARCH_DEPTH,
  include_answer: true,
}),

return {
ok: false,
name: "WebSearch",
error: `Tavily search failed with status ${response.status}${body ? `: ${body}` : ""}`,

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Low] Untruncated upstream response body echoed into error output

The raw upstream response body is interpolated directly into the error string, which flows into the LLM context and session logs.

Risks:

  • Response bodies can be very long and bloat the context window.
  • If Tavily's response ever contains reflective data (e.g. echo-back of headers or credentials), it could leak into persisted logs.

Suggested fix — truncate and sanitize:

const body = await response.text().catch(() => "");
const safeBody = body ? `: ${body.slice(0, 200)}` : "";
return {
  ok: false,
  name: "WebSearch",
  error: `Tavily search failed with status ${response.status}${safeBody}`,
};

Comment thread package-lock.json
{
"name": "@vegamo/deepcode-cli",
"version": "0.1.27",
"version": "0.3.5",

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Low/Critical for merge] Version downgrade — main is already at 0.3.6

This hunk changes the lockfile version to 0.3.5, but main is currently at 0.3.6. Merging as-is would regress the lockfile version, causing a mismatch between package.json and package-lock.json.

Suggestion: drop this file's changes on merge (git checkout --ours package-lock.json) and regenerate the lockfile with npm install --package-lock-only if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants