Add an in-memory LRU cache implementation and integrate it into exaSearch (check cache before network, set on success). Expected outcome: short-lived cache keyed by request body, metrics for cache hits/misses, and eviction when capacity is reached. #10

Open
rajarshidattapy wants to merge 3 commits into AdityaP700:main from rajarshidattapy:cache-ratelimit

Conversation

@rajarshidattapy
Contributor


name: "Exa Cache & Rate-limit"
about: Add short-lived cache and rate-limiting safeguards for Exa queries
title: "[Infra][Exa] Cache & Rate-limit"

✅ Pre-PR Checklist

Before submitting, please confirm:

  • I have run npm run lint / npm run test locally
  • I have run npm install if dependencies changed
  • I have NOT committed secrets or full API keys
  • I have documented required env vars and config below
  • I have tested cache and rate-limit behaviour locally

Description

Brief summary of the change being proposed:

  • What changed: added a short-lived in-memory LRU cache (TTL + capacity) to lib/exa-service.ts, and integrated cache lookups for Exa requests to reduce redundant API calls; added verbose logging hooks for cache hits/misses.
  • Why: reduce cost and latency for repeated identical Exa queries and reduce pressure on the Exa API by avoiding duplicate requests.

Files Changed

  • exora/lib/exa-service.ts — added cache helpers, cache integration, and config via env vars

Type of Change

Select all that apply:

  • Bug Fix
  • New Feature (caching)
  • Chore (infra, perf)
  • Documentation Update

Configuration / Environment Variables

Configurable env vars introduced or used:

  • EXA_CACHE_ENABLED (default: true) — enable/disable cache
  • EXA_CACHE_TTL_MS (default: 60000) — cache entry TTL in ms
  • EXA_CACHE_CAPACITY (default: 500) — max entries in in-memory LRU
  • VERBOSE_LOGS (optional) — enable detailed logs for cache hits/misses

Notes:

  • Cache is keyed by request body only (API keys are intentionally excluded from keys).
  • Cache is in-memory and per-process. For multi-instance deployments, consider a Redis-backed cache in a follow-up PR.
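
The defaults listed above could be read defensively along the following lines. This is a minimal sketch, not the code in this PR: `readExaCacheConfig`, `positiveIntOr`, and the `ExaCacheConfig` type are hypothetical names, and falling back to the default on invalid input is an assumed policy. It matters because `Number("abc")` is `NaN`, and comparisons against `NaN` are always false, so a mistyped TTL or capacity would silently disable expiry or eviction.

```typescript
// Hypothetical helper: names and fallback policy are assumptions, not part of the PR.
interface ExaCacheConfig {
  enabled: boolean;
  ttlMs: number;
  capacity: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back to the default for missing or invalid input.
function positiveIntOr(raw: string | undefined, fallback: number): number {
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function readExaCacheConfig(env: Env): ExaCacheConfig {
  return {
    // Matches the documented default: enabled unless explicitly set to "false".
    enabled: env.EXA_CACHE_ENABLED !== "false",
    ttlMs: positiveIntOr(env.EXA_CACHE_TTL_MS, 60_000),
    capacity: positiveIntOr(env.EXA_CACHE_CAPACITY, 500),
  };
}
```

In a Node.js process this would be called once at module load as `readExaCacheConfig(process.env)`.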

How Has This Been Tested?

Explain validation steps and results:

  • Unit tests (not included in this PR)
  • Manual testing:
    • Start dev server with VERBOSE_LOGS=true and EXA_CACHE_TTL_MS set low (e.g., 5000).
    • Trigger the same Exa-backed flow twice within TTL. Verify second request is served from cache (logs show cache hit).
    • Confirm behavior when cache disabled (EXA_CACHE_ENABLED=false) — second request should call Exa.

Manual test commands (PowerShell example):

cd exora
$env:VERBOSE_LOGS = "true"
$env:EXA_CACHE_TTL_MS = "5000"
npm run dev

Then hit the UI/route that triggers Exa twice and observe logs.


Security & Privacy

  • Do not log API keys or secret payloads. Cache keys explicitly exclude API keys.
  • Ensure BYOK keys provided by users are not written to logs or telemetry.

Rollback Plan

  • Revert this PR if any regressions are observed.
  • Temporarily disable cache with EXA_CACHE_ENABLED=false and redeploy.

Linked Issue

Closes #<issue_number> (if applicable)


Additional Context

  • For future work: add Redis-backed cache for multi-instance deployments and export cache metrics (hits/misses) to Prometheus or Datadog.
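
As a starting point for the metrics mentioned above, hit and miss counts could be tracked in-process before any exporter is wired up. The sketch below is an assumption about how that might look: `CacheStats` and its methods are hypothetical names, and no Prometheus or Datadog client API is used.

```typescript
// Hypothetical in-process counters; names are illustrative and not part of the PR.
class CacheStats {
  private hits = 0;
  private misses = 0;

  recordHit(): void {
    this.hits += 1;
  }

  recordMiss(): void {
    this.misses += 1;
  }

  // Snapshot suitable for logging or for a metrics exporter to read periodically.
  snapshot(): { hits: number; misses: number; hitRatio: number } {
    const total = this.hits + this.misses;
    return {
      hits: this.hits,
      misses: this.misses,
      hitRatio: total === 0 ? 0 : this.hits / total,
    };
  }
}
```

The cache lookup would call `recordHit()` or `recordMiss()` at the same points where the verbose log lines are emitted.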

@vercel

vercel Bot commented Oct 25, 2025

@rajarshidattapy is attempting to deploy a commit to the adityap700's projects Team on Vercel.

A member of the Team first needs to authorize it.

@gemini-code-assist

Summary of Changes

Hello @rajarshidattapy, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant performance and cost optimization by integrating a short-lived, in-memory LRU cache for Exa API requests. The primary goal is to minimize redundant network calls for identical queries, thereby reducing latency and API costs. The cache is highly configurable through environment variables, allowing fine-grained control over its behavior, and includes verbose logging to provide insights into its operation. Additionally, the project's README.md has been enhanced with details on CI processes.

Highlights

  • In-memory LRU Cache Implementation: An in-memory Least Recently Used (LRU) cache with a Time-To-Live (TTL) mechanism has been implemented to store responses from Exa API queries.
  • Cache Integration: The exaSearch function now checks the cache before making a network request and stores successful API responses in the cache, reducing redundant calls.
  • Configurable Cache Parameters: Cache behavior can be configured via environment variables, including EXA_CACHE_ENABLED, EXA_CACHE_TTL_MS (default 60s), and EXA_CACHE_CAPACITY (default 500 entries).
  • Verbose Logging: Detailed logging for cache hits and misses can be enabled by setting VERBOSE_LOGS=true, aiding in monitoring and debugging.
  • Documentation Update: The README.md file has been updated to include a new section on Continuous Integration (CI) practices and local testing commands.
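
The check-before-network, set-on-success flow described in the highlights can be sketched as follows. This is an illustration only: `searchWithCache`, `Fetcher`, and `responseCache` are hypothetical names, the real `exaSearch` signature is not shown on this page, and TTL and eviction are left out to keep the control flow visible.

```typescript
// Sketch only: all names here are assumptions, not the PR's actual API.
type Fetcher<T> = (body: object) => Promise<T>;

const responseCache = new Map<string, unknown>();

async function searchWithCache<T>(body: object, fetcher: Fetcher<T>): Promise<T> {
  const key = JSON.stringify(body); // keyed by request body only
  if (responseCache.has(key)) {
    return responseCache.get(key) as T; // cache hit: no network call
  }
  const result = await fetcher(body); // cache miss: call the API
  responseCache.set(key, result); // reached only if the fetcher resolved
  return result;
}
```

Because the `set` happens after the awaited call, a rejected request never populates the cache.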
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/ci.yml


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces an in-memory LRU cache for Exa API requests, which is a great initiative to improve performance and reduce costs. The implementation is mostly solid, but I've identified a critical bug in the cache key generation that could lead to serving incorrect data. I've provided a detailed comment and a code suggestion to fix this. Additionally, I've made a couple of medium-severity suggestions to improve error handling and encourage adding unit tests for the new caching logic to ensure its robustness. The documentation changes in the README are clear and helpful.

Comment thread exora/lib/exa-service.ts
Comment on lines +18 to +48
function makeCacheKey(requestBody: object) {
  try {
    // Key by request body only (do not include API keys)
    return Buffer.from(JSON.stringify(requestBody)).toString('base64')
  } catch (e) {
    return String(requestBody)
  }
}

function getCache(key: string) {
  if (!EXA_CACHE_ENABLED) return null
  const entry = exaCache.get(key)
  if (!entry) return null
  if (Date.now() > entry.expiresAt) {
    exaCache.delete(key)
    return null
  }
  // LRU: move to end
  exaCache.delete(key)
  exaCache.set(key, entry)
  return entry.value
}

function setCache(key: string, value: any) {
  if (!EXA_CACHE_ENABLED) return
  if (exaCache.size >= EXA_CACHE_CAPACITY) {
    const oldest = exaCache.keys().next().value
    if (oldest) exaCache.delete(oldest)
  }
  exaCache.set(key, { value, expiresAt: Date.now() + EXA_CACHE_TTL_MS })
}

critical

There's a critical issue in makeCacheKey that can lead to cache poisoning. The fallback return String(requestBody) will produce "[object Object]" for any object that fails JSON.stringify (e.g., due to circular references). This causes all such requests to share the same cache key, leading to incorrect cached responses being served.

To fix this, makeCacheKey should signal failure (e.g., by returning null), and getCache/setCache should be updated to handle this signal by skipping cache operations.

Here's a suggested implementation for all three functions that resolves this issue:

function makeCacheKey(requestBody: object): string | null {
  try {
    // Key by request body only (do not include API keys)
    return Buffer.from(JSON.stringify(requestBody)).toString('base64')
  } catch (e) {
    console.warn('[exa] Failed to generate cache key, skipping cache for this request.', e)
    return null
  }
}

function getCache(key: string | null) {
  if (!key || !EXA_CACHE_ENABLED) return null
  const entry = exaCache.get(key)
  if (!entry) return null
  if (Date.now() > entry.expiresAt) {
    exaCache.delete(key)
    return null
  }
  // LRU: move to end
  exaCache.delete(key)
  exaCache.set(key, entry)
  return entry.value
}

function setCache(key: string | null, value: any) {
  if (!key || !EXA_CACHE_ENABLED) return
  if (exaCache.size >= EXA_CACHE_CAPACITY) {
    const oldest = exaCache.keys().next().value
    if (oldest) exaCache.delete(oldest)
  }
  exaCache.set(key, { value, expiresAt: Date.now() + EXA_CACHE_TTL_MS })
}
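
The collision described in this comment can be reproduced in isolation. The snippet below is a small demonstration, with `originalFallbackKey`, `bodyA`, and `bodyB` as hypothetical names; it mirrors the original fallback but omits the base64 step, which does not affect the outcome.

```typescript
// Mirrors the original fallback behaviour of makeCacheKey (base64 step omitted).
function originalFallbackKey(requestBody: object): string {
  try {
    return JSON.stringify(requestBody);
  } catch (e) {
    return String(requestBody); // "[object Object]" for plain objects
  }
}

// Two different request bodies that each contain a circular reference,
// which makes JSON.stringify throw a TypeError.
const bodyA: Record<string, unknown> = { query: "alpha" };
bodyA.self = bodyA;
const bodyB: Record<string, unknown> = { query: "beta" };
bodyB.self = bodyB;

const keyA = originalFallbackKey(bodyA);
const keyB = originalFallbackKey(bodyB);
// Both keys are "[object Object]", so the two requests would share one cache entry.
```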

Comment thread exora/lib/exa-service.ts
Comment on lines +10 to +16
// Simple in-memory cache for Exa queries (short-lived LRU with TTL)
const EXA_CACHE_ENABLED = process.env.EXA_CACHE_ENABLED !== 'false'
const EXA_CACHE_TTL_MS = Number(process.env.EXA_CACHE_TTL_MS || 60_000) // default 60s
const EXA_CACHE_CAPACITY = Number(process.env.EXA_CACHE_CAPACITY || 500)

interface ExaCacheEntry { value: any; expiresAt: number }
const exaCache = new Map<string, ExaCacheEntry>()

medium

This new cache implementation is a great addition for performance. Since it contains non-trivial logic for TTL, LRU eviction, and key generation, it would be highly beneficial to add unit tests for it. This would help prevent regressions and verify edge cases, such as the cache capacity being reached or expired items being evicted correctly. The PR description mentions unit tests are not included, but I'd strongly recommend adding them for this critical piece of logic.
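
Tests along these lines would cover the two edge cases named above. The sketch assumes the cache is refactored into a class with an injectable clock so that TTL can be tested without waiting; `TtlLruCache`, `demoCache`, and `fakeNow` are hypothetical names, whereas the PR itself uses module-level functions and `Date.now()`. One deliberate difference from the PR: eviction here runs only when a new key is added, not when an existing key is overwritten.

```typescript
// Hypothetical refactor for testability: Map-based LRU + TTL logic
// wrapped in a class with an injectable clock. Names are assumptions.
class TtlLruCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private capacity: number,
    private ttlMs: number,
    private now: () => number = Date.now,
  ) {}

  get(key: string): V | null {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (this.now() > entry.expiresAt) {
      this.entries.delete(key); // expired: drop and report a miss
      return null;
    }
    // Re-insert so the key becomes most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    // Evict the least recently used key only when adding a new key at capacity.
    if (!this.entries.has(key) && this.entries.size >= this.capacity) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Two checks a unit test could make, driven by a fake clock.
let fakeNow = 0;
const demoCache = new TtlLruCache<string>(2, 1000, () => fakeNow);
demoCache.set("a", "A");
demoCache.set("b", "B");
demoCache.get("a"); // touch "a" so "b" becomes least recently used
demoCache.set("c", "C"); // at capacity: evicts "b"
const evictedLeastRecent = demoCache.get("b") === null && demoCache.get("a") === "A";
fakeNow = 1001; // advance past the TTL
const expiredAfterTtl = demoCache.get("a") === null;
```

In a real suite the two boolean checks would become assertions in whichever test runner the project adopts.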

Comment thread exora/lib/exa-service.ts
- return response.json();
+ const json = await response.json();
+ // store in cache for next requests
+ try { setCache(cacheKey, json) } catch (e) {}

medium

Silently catching and ignoring errors with an empty catch block (catch (e) {}) can hide bugs and make debugging difficult. While setCache as written is unlikely to throw errors, it's a good practice to at least log any unexpected errors that might occur, especially if the function's implementation changes in the future.

Suggested change
- try { setCache(cacheKey, json) } catch (e) {}
+ try { setCache(cacheKey, json) } catch (e) { console.warn('[exa] Failed to set cache value.', e) }
