fix: harden multi-tenant resource management and quota enforcement#656
Merged
…lti-tenant

Two fixes to support multi-tenant (premium) mode:

1. Pipeline override: add set_pipeline_override() and get_active_pipeline() to router.py so the premium plugin can inject quota-check and usage-tracking steps into the agent pipeline. Previously, handle_inbound_message() always used DEFAULT_PIPELINE, bypassing all premium quota enforcement.
2. Singleton user reuse guard: in _get_or_create_user(), skip the "reuse sole existing user" path when settings.premium_plugin is set. Without this guard, the first new Telegram sender in a multi-tenant deployment would have their messages linked to an existing user's account (data leak).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
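The pipeline override in item 1 could look roughly like the sketch below. Only the names set_pipeline_override(), get_active_pipeline(), handle_inbound_message() and DEFAULT_PIPELINE come from the commit message; the step signature (a function over a context dict), the run_agent step, and the simplified handle_inbound_message() signature are assumptions made for illustration, not the actual router.py code.

```python
from typing import Callable, Optional, Sequence

# Assumed step shape: each step receives a context dict and returns it.
Step = Callable[[dict], dict]


def run_agent(ctx: dict) -> dict:
    """Hypothetical stand-in for the real agent step."""
    ctx["reply"] = f"echo: {ctx['text']}"
    return ctx


DEFAULT_PIPELINE: tuple[Step, ...] = (run_agent,)

# None means "no override installed"; the OSS default is used.
_pipeline_override: Optional[tuple[Step, ...]] = None


def set_pipeline_override(steps: Optional[Sequence[Step]]) -> None:
    """Install a replacement pipeline, or clear it by passing None."""
    global _pipeline_override
    _pipeline_override = tuple(steps) if steps is not None else None


def get_active_pipeline() -> tuple[Step, ...]:
    """Return the override if one is set, otherwise DEFAULT_PIPELINE."""
    if _pipeline_override is not None:
        return _pipeline_override
    return DEFAULT_PIPELINE


def handle_inbound_message(text: str) -> dict:
    """Run the message through whichever pipeline is currently active."""
    ctx: dict = {"text": text}
    for step in get_active_pipeline():
        ctx = step(ctx)
    return ctx
```

Under these assumptions, a plugin would call set_pipeline_override([check_quota, *DEFAULT_PIPELINE, track_usage]) once at startup, where check_quota and track_usage are its own (hypothetical) steps. Resolving the pipeline at call time rather than import time is what lets the override take effect.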
- Bound per-user store caches (SessionStore, MemoryStore) with LRU eviction to prevent unbounded memory growth under multi-tenant load
- Accumulate input/output token counts across multi-round agent loops in AgentResponse so premium can enforce per-user token quotas
- Add pluggable estimate quota hooks (check + increment) so premium can inject quota enforcement without modifying OSS code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
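A minimal sketch of how the pluggable check + increment hooks might be wired, for readers unfamiliar with the pattern. The function name set_estimate_quota_hooks(check, increment) appears in the PR description; the hook signatures (both taking a user id), the create_estimate tool function, and its return strings are hypothetical and only illustrate the idea.

```python
from typing import Callable, Optional

# Assumed hook signatures; the real ones may differ.
CheckHook = Callable[[str], bool]       # True if the user is still within quota
IncrementHook = Callable[[str], None]   # record one unit of usage for the user

_check_hook: Optional[CheckHook] = None
_increment_hook: Optional[IncrementHook] = None


def set_estimate_quota_hooks(
    check: Optional[CheckHook], increment: Optional[IncrementHook]
) -> None:
    """Register quota hooks; with none registered, no quota is enforced."""
    global _check_hook, _increment_hook
    _check_hook = check
    _increment_hook = increment


def create_estimate(user_id: str, description: str) -> str:
    """Hypothetical tool entry point that consults the hooks."""
    if _check_hook is not None and not _check_hook(user_id):
        return "Estimate quota exceeded."
    result = f"Estimate created: {description}"
    # Count usage only after the work succeeded.
    if _increment_hook is not None:
        _increment_hook(user_id)
    return result
```

With no hooks registered the tool behaves as before, which is the point of the design: the OSS code path stays unchanged and quota policy lives entirely in the code that registers the hooks.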
Description
Harden OSS layer for multi-tenant SaaS usage by premium:
- Replace unbounded `dict` caches in `SessionStore` and `MemoryStore` with `OrderedDict`-based LRU (cap: 256 entries). Prevents memory leaks when many tenants are active.
- Accumulate `input_tokens`/`output_tokens` across multi-round agent loops in `AgentResponse`, enabling premium to enforce per-user token quotas.
- Add `set_estimate_quota_hooks(check, increment)` API so premium can inject quota enforcement at the tool level without modifying OSS code.

Companion PR: mozilla-ai/clawbolt-premium (fix/wire-premium-pipeline-and-ci)
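For reference, an OrderedDict-based LRU cache of the kind the description mentions can be sketched as follows. The class name LRUCache and the get_or_create method are assumptions for illustration; only the OrderedDict approach and the 256-entry cap come from the description, and the actual SessionStore / MemoryStore code may be structured differently.

```python
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class LRUCache(Generic[K, V]):
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, max_entries: int = 256) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._items: OrderedDict[K, V] = OrderedDict()

    def get_or_create(self, key: K, factory: Callable[[], V]) -> V:
        """Return the cached value, building it with factory() on a miss."""
        if key in self._items:
            # Mark as most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        value = factory()
        self._items[key] = value
        if len(self._items) > self._max_entries:
            # popitem(last=False) removes the oldest (least recently used) entry.
            self._items.popitem(last=False)
        return value

    def __contains__(self, key: object) -> bool:
        return key in self._items

    def __len__(self) -> int:
        return len(self._items)
```

A per-user store would then be fetched with something like cache.get_or_create(user_id, lambda: build_store(user_id)), where build_store is a hypothetical constructor. Evicted entries are simply rebuilt on the next access, so the cap trades a little recomputation for bounded memory.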
Type
Checklist
- `uv run pytest -v`
- `ruff check backend/ && ruff format --check backend/`

AI Usage
🤖 Generated with Claude Code