Skip to content

fix(api): cap unbounded async concurrency with p-limit#1975

Open
anis-marrouchi wants to merge 2 commits into
hexabot-ai:mainfrom
anis-marrouchi:4-cap-unbounded-async-concurrency
Open

fix(api): cap unbounded async concurrency with p-limit#1975
anis-marrouchi wants to merge 2 commits into
hexabot-ai:mainfrom
anis-marrouchi:4-cap-unbounded-async-concurrency

Conversation

@anis-marrouchi

@anis-marrouchi anis-marrouchi commented Apr 30, 2026

Copy link
Copy Markdown
Collaborator

What changed?

Summary: Replaced unbounded Promise.all(array.map(async ...)) fan-out with p-limit in 6 API hotspots where input size is user/config-driven.

Why: Unbounded async loops can spike DB/socket/filesystem pressure under large payloads, causing event-loop contention and degraded tail latency. p-limit (already a dependency) caps concurrency with sensible per-domain defaults.

Hotspots capped:

File I/O type Concurrency cap
extension-json.loader.ts Filesystem traversal 8
translation.controller.ts DB upserts (findOneOrCreate) 10
Handler.ts Attachment storage 4
attachment-ability.guard.ts DB permission checks 10
memory-store.ts DB memory entry persistence 8
app.service.ts Socket room joins 15

How to review: Each file has a small, isolated change — pLimit(N) wrapping the existing .map() callback. No behavior or response payload changes.

Validation: Typecheck and lint pass. Existing behavior preserved — only execution concurrency is bounded.

Closes anis-marrouchi#4
Related upstream issue: #1863

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Performance & Stability
    • Introduced controlled concurrency across backend operations to improve reliability and prevent overload: throttled room subscriptions, authorization checks, attachment storage, translation refreshes, extension file scanning, and workflow data updates—yielding steadier performance under load.

@coderabbitai

coderabbitai Bot commented Apr 30, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Adds p-limit-based concurrency caps across several API hotspots so previously unbounded Promise.all fan-outs (sockets, DB checks, attachment storage, translation lookups, filesystem traversal, workflow memory updates) now run with bounded concurrency; request/response shapes and authorization logic remain unchanged.

Changes

Concurrency Limits (p-limit)

Layer / File(s) Summary
Dependency / Imports
packages/api/... (multiple files)
Adds p-limit imports where concurrency is limited.
Socket joins
packages/api/src/app.service.ts
subscribeToEntityRooms now wraps req.socket.join(room) with pLimit(15) instead of unbounded Promise.all.
Authorization checks
packages/api/src/attachment/guards/attachment-ability.guard.ts
isAuthorized and canActivate (GET branch) wrap permission lookups with pLimit(10) to cap concurrent DB auth calls.
Attachment storage
packages/api/src/channel/lib/Handler.ts
persistMessageAttachments uses pLimit(4) to limit concurrent attachmentService.store(...) calls.
Translation refresh (DB)
packages/api/src/i18n/controllers/translation.controller.ts
refresh wraps translationService.findOneOrCreate(...) calls with pLimit(10) and awaits Promise.all as before.
Filesystem traversal (i18n loader)
packages/api/src/i18n/loaders/extension-json.loader.ts
resolveExtensionTranslationFiles and recursive findI18nDirectories accept a shared fsLimit = pLimit(8) and run directory reads/recursion through it.
Workflow memory updates
packages/api/src/workflow/utils/memory-store.ts
updateStoreEntries applies pLimit(8) to per-entry deepMerge/updateStoreEntry calls and returns Object.fromEntries(updates) as before.
Manifest / Packaging
package.json (project-level references in diffs)
PR references p-limit usage and includes package manifest mentions across changed modules.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API
    participant Limiter as "p-limit (N)"
    participant DB
    participant FS
    participant Storage
    participant Socket

    Client->>API: Request triggers batched async work
    API->>Limiter: enqueue tasks (auth, joins, FS reads, stores, translations)
    Limiter--)API: allow up to N concurrent tasks
    API->>DB: permission checks / translation lookups (bounded)
    API->>FS: extension dir/file reads (bounded)
    API->>Storage: store attachments (bounded)
    API->>Socket: join rooms (bounded)
    DB-->>API: DB responses
    FS-->>API: file lists
    Storage-->>API: store confirmations
    Socket-->>API: join confirmations
    API-->>Client: aggregate results (same payload shape)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related issues

Poem

🐰 I bounded hops across the stream,
Tamed wild Promises into a team.
Sockets, files, and DB queues—
I paced the bustle, eased the queues.
A little hop, a steady dream.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: capping unbounded async concurrency with p-limit across multiple API hotspots.
Description check ✅ Passed The description follows the template structure, provides summary, rationale, implementation details, and validation information covering all key aspects.
Linked Issues check ✅ Passed All six hotspots from issue #4 are addressed with appropriate concurrency caps (8 for filesystem, 10 for DB checks, 4 for storage, 8 for memory entries, 15 for sockets), and behavior/payloads are preserved.
Out of Scope Changes check ✅ Passed All changes are scoped to the six identified hotspots with localized p-limit wrapping; no extraneous modifications to unrelated code paths or functionality.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/api/src/app.service.ts (1)

84-84: ⚡ Quick win

Use a named constant (with a short rationale) for the room-join cap.

The inline 15 is a tuning knob; extracting it (e.g., SOCKET_JOIN_CONCURRENCY) with a one-line comment will make future tuning safer and clearer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/api/src/app.service.ts` at line 84, Extract the magic number 15 into
a named constant (e.g., SOCKET_JOIN_CONCURRENCY) and replace the inline literal
when creating socketLimit (pLimit). Add a one-line comment explaining this is
the concurrency cap for joining rooms so it’s easy to tune later; update the
usage in the socketLimit assignment (where pLimit(15) is called) to
pLimit(SOCKET_JOIN_CONCURRENCY).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/api/src/i18n/loaders/extension-json.loader.ts`:
- Around line 190-203: The current findI18nDirectories function creates a new
pLimit(8) (fsLimit) on each recursive call which allows uncontrolled total
concurrency; refactor so the limiter is created once and reused across
recursion—either add an optional parameter (e.g., fsLimit: ReturnType<typeof
pLimit>) to findI18nDirectories and pass the same limiter into recursive calls,
or make a shared class/module-level limiter used by findI18nDirectories; replace
the local pLimit(8) usage with the shared limiter wherever entries are mapped to
limiters so total concurrent readdir/stat operations stay globally capped at 8.

---

Nitpick comments:
In `@packages/api/src/app.service.ts`:
- Line 84: Extract the magic number 15 into a named constant (e.g.,
SOCKET_JOIN_CONCURRENCY) and replace the inline literal when creating
socketLimit (pLimit). Add a one-line comment explaining this is the concurrency
cap for joining rooms so it’s easy to tune later; update the usage in the
socketLimit assignment (where pLimit(15) is called) to
pLimit(SOCKET_JOIN_CONCURRENCY).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5d690c50-1abe-412b-8857-b9a73fd0b7da

📥 Commits

Reviewing files that changed from the base of the PR and between 8258744 and b9028f8.

📒 Files selected for processing (6)
  • packages/api/src/app.service.ts
  • packages/api/src/attachment/guards/attachment-ability.guard.ts
  • packages/api/src/channel/lib/Handler.ts
  • packages/api/src/i18n/controllers/translation.controller.ts
  • packages/api/src/i18n/loaders/extension-json.loader.ts
  • packages/api/src/workflow/utils/memory-store.ts

Comment thread packages/api/src/i18n/loaders/extension-json.loader.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/api/src/i18n/loaders/extension-json.loader.ts (1)

162-163: ⚡ Quick win

Extract and document the concurrency cap

Line 162 uses a bare 8. Please extract it to a named constant with a short tuning note to make future adjustments explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/api/src/i18n/loaders/extension-json.loader.ts` around lines 162 -
163, Extract the hard-coded concurrency cap (the literal 8 passed to pLimit when
creating fsLimit) into a named constant (e.g., FILE_READ_CONCURRENCY = 8) and
replace the magic number in the fsLimit = pLimit(…) call with that constant; add
a short inline comment above the constant explaining why the cap exists and a
tuning note for future adjustments (e.g., "tune based on IO throughput / system
limits"). Ensure references to fsLimit and pLimit remain unchanged elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/api/src/i18n/loaders/extension-json.loader.ts`:
- Around line 166-167: The current implementation acquires the same fsLimit
semaphore at both the parent mapping of extensionRootPaths and inside recursive
findI18nDirectories, which can deadlock when many parents occupy all slots
waiting on children; change the concurrency strategy by removing the outer
fsLimit wrapper around the recursive call sites and instead centralize semaphore
usage into low-level filesystem helpers (wrap fs.readdir/fs.stat/etc. with a
helper that calls fsLimit.run or similar) so findI18nDirectories and any callers
call those helpers without re-acquiring the semaphore; update references in the
code (fsLimit, findI18nDirectories, extensionRootPaths mapping) to use the new
fs-limited helpers for I/O only, keeping recursion itself semaphore-free.

---

Nitpick comments:
In `@packages/api/src/i18n/loaders/extension-json.loader.ts`:
- Around line 162-163: Extract the hard-coded concurrency cap (the literal 8
passed to pLimit when creating fsLimit) into a named constant (e.g.,
FILE_READ_CONCURRENCY = 8) and replace the magic number in the fsLimit =
pLimit(…) call with that constant; add a short inline comment above the constant
explaining why the cap exists and a tuning note for future adjustments (e.g.,
"tune based on IO throughput / system limits"). Ensure references to fsLimit and
pLimit remain unchanged elsewhere.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9e6b45d8-75e5-483e-89ad-7bc89b76142d

📥 Commits

Reviewing files that changed from the base of the PR and between b9028f8 and 3b92c1d.

📒 Files selected for processing (1)
  • packages/api/src/i18n/loaders/extension-json.loader.ts

Comment on lines +166 to 167
fsLimit(() => this.findI18nDirectories(rootPath, fsLimit)),
),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify nested limiter acquisition pattern in the changed file.
# Expected: matches both (1) top-level limited recursive call and (2) recursive limited child call.
fd -i 'extension-json.loader.ts' --exec rg -n -C2 \
'fsLimit\(\(\) => this\.findI18nDirectories\(rootPath, fsLimit\)\)|fsLimit\(async \(\) =>|return this\.findI18nDirectories\(entryPath, fsLimit\)'

Repository: Hexastack/Hexabot

Length of output: 599


🏁 Script executed:

sed -n '155,210p' packages/api/src/i18n/loaders/extension-json.loader.ts

Repository: Hexastack/Hexabot

Length of output: 1467


🏁 Script executed:

rg -n 'resolveExtensionTranslationFiles' packages/api/src/i18n/loaders/extension-json.loader.ts

Repository: Hexastack/Hexabot

Length of output: 191


🏁 Script executed:

sed -n '110,130p' packages/api/src/i18n/loaders/extension-json.loader.ts

Repository: Hexastack/Hexabot

Length of output: 775


🏁 Script executed:

rg -n 'extensionPaths' packages/api/src/i18n/loaders/extension-json.loader.ts

Repository: Hexastack/Hexabot

Length of output: 344


🏁 Script executed:

# Check how this loader is instantiated and what paths are typically provided
rg -n 'ExtensionJsonLoader' packages/api/src --type ts -A 3 | head -40

Repository: Hexastack/Hexabot

Length of output: 3201


Nested fsLimit acquisition in recursive traversal creates deadlock risk

The pattern at lines 166 and 197–204 acquires the same fsLimit semaphore across parent and child tasks. When extensionRootPaths has many entries or subdirectories are numerous, all 8 slots can be consumed by parent tasks waiting on children that need the same slots, blocking progress.

Consider limiting filesystem operations (e.g., fs.readdir) directly instead of wrapping entire recursive frames. This isolates I/O concurrency control from recursive call stack, preventing nested slot acquisition:

Suggested refactor
   private async resolveExtensionTranslationFiles(
     extensionRootPaths: string[],
   ): Promise<string[]> {
     const fsLimit = pLimit(8);
     const i18nDirectories = (
       await Promise.all(
-        extensionRootPaths.map((rootPath) =>
-          fsLimit(() => this.findI18nDirectories(rootPath, fsLimit)),
-        ),
+        extensionRootPaths.map((rootPath) =>
+          this.findI18nDirectories(rootPath, fsLimit),
+        ),
       )
     ).flat();
     // ... rest unchanged
   }

   private async findI18nDirectories(
     rootPath: string,
     fsLimit: ReturnType<typeof pLimit>,
   ): Promise<string[]> {
     const entries = await fsLimit(() =>
       fs.readdir(rootPath, { withFileTypes: true }).catch(() => null),
     );
     if (!entries) {
       return [];
     }
     const nestedResults = await Promise.all(
       entries
         .filter((entry) => entry.isDirectory())
         .map((entry) => {
           const entryPath = path.join(rootPath, entry.name);
           if (entry.name === 'i18n') {
             return Promise.resolve([entryPath]);
           }
           return this.findI18nDirectories(entryPath, fsLimit);
         }),
     );

     return nestedResults.flat();
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/api/src/i18n/loaders/extension-json.loader.ts` around lines 166 -
167, The current implementation acquires the same fsLimit semaphore at both the
parent mapping of extensionRootPaths and inside recursive findI18nDirectories,
which can deadlock when many parents occupy all slots waiting on children;
change the concurrency strategy by removing the outer fsLimit wrapper around the
recursive call sites and instead centralize semaphore usage into low-level
filesystem helpers (wrap fs.readdir/fs.stat/etc. with a helper that calls
fsLimit.run or similar) so findI18nDirectories and any callers call those
helpers without re-acquiring the semaphore; update references in the code
(fsLimit, findI18nDirectories, extensionRootPaths mapping) to use the new
fs-limited helpers for I/O only, keeping recursion itself semaphore-free.

anis-marrouchi and others added 2 commits May 2, 2026 10:42
Replace unbounded Promise.all fan-out with p-limit in 6 hotspots
to prevent burst load on DB/filesystem/socket backends:

- extension-json.loader: filesystem traversal (cap=8)
- translation.controller: DB upserts during refresh (cap=10)
- Handler: attachment storage (cap=4)
- attachment-ability.guard: permission checks (cap=10)
- memory-store: memory entry persistence (cap=8)
- app.service: socket room joins (cap=15)

Closes #4

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tories

Each recursive call was creating a new pLimit(8), defeating the global
concurrency cap. Pass the limiter as a parameter so all recursion levels
share one queue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@anis-marrouchi anis-marrouchi force-pushed the 4-cap-unbounded-async-concurrency branch from 3b92c1d to 7885d33 Compare May 2, 2026 09:43

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/api/src/attachment/guards/attachment-ability.guard.ts`:
- Around line 250-258: You are creating per-call pLimit instances so DB checks
still fan out; lift the concurrency limiter to the request scope and use the
same limiter for all permission checks: create one pLimit(10) (the existing
dbLimit) and stop isAuthorized from constructing its own pLimit per
invocation—either add a limiter parameter to isAuthorized and pass dbLimit
through (from the resourceRef.map branch) or move the DB-limited work into a
helper that accepts the shared dbLimit; ensure all calls that previously created
pLimit inside isAuthorized (and any callers) use this single shared dbLimit so
all resourceRef checks and nested DB operations share the same concurrency cap.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 97ec7985-a0eb-42ec-94e8-0373f66f23d9

📥 Commits

Reviewing files that changed from the base of the PR and between 3b92c1d and 7885d33.

📒 Files selected for processing (6)
  • packages/api/src/app.service.ts
  • packages/api/src/attachment/guards/attachment-ability.guard.ts
  • packages/api/src/channel/lib/Handler.ts
  • packages/api/src/i18n/controllers/translation.controller.ts
  • packages/api/src/i18n/loaders/extension-json.loader.ts
  • packages/api/src/workflow/utils/memory-store.ts
✅ Files skipped from review due to trivial changes (2)
  • packages/api/src/app.service.ts
  • packages/api/src/i18n/loaders/extension-json.loader.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/api/src/workflow/utils/memory-store.ts

Comment on lines +250 to 258
const dbLimit = pLimit(10);

return (
await Promise.all(
resourceRef.map((c) => this.isAuthorized(Action.READ, user, c)),
resourceRef.map((c) =>
dbLimit(() => this.isAuthorized(Action.READ, user, c)),
),
)
).every(Boolean);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Share one concurrency budget across the whole GET authorization flow.

isAuthorized() now creates a fresh pLimit(10) per call, and this branch adds another pLimit(10) around each resourceRef. With multiple refs in a single request, the permission checks still fan out per ref instead of sharing one cap, so the DB burst can remain much higher than intended. Please thread one limiter through both loops or lift it to the request scope.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/api/src/attachment/guards/attachment-ability.guard.ts` around lines
250 - 258, You are creating per-call pLimit instances so DB checks still fan
out; lift the concurrency limiter to the request scope and use the same limiter
for all permission checks: create one pLimit(10) (the existing dbLimit) and stop
isAuthorized from constructing its own pLimit per invocation—either add a
limiter parameter to isAuthorized and pass dbLimit through (from the
resourceRef.map branch) or move the DB-limited work into a helper that accepts
the shared dbLimit; ensure all calls that previously created pLimit inside
isAuthorized (and any callers) use this single shared dbLimit so all resourceRef
checks and nested DB operations share the same concurrency cap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Issue] Cap unbounded async concurrency in API loops using p-limit

1 participant