Feat/rate limit violation tracking claude opus #74

Open
kudroma404 wants to merge 4 commits into main from feat/rate-limit-violation-tracking-claude-opus

kudroma404 commented Mar 10, 2026

Rate Limit Violation Tracking

Summary

Adds batched rate-limit violation tracking that accumulates per-client violations in memory and periodically flushes them to a dedicated PostgreSQL table. This avoids spamming the database with individual violation records on hot paths.

Motivation

Previously, rate limit violations were only observable through HTTP 429 responses. There was no persistent record of how often limits were hit, by which clients, or through which limiters. This made it difficult to tune rate limits, detect abuse patterns, or understand capacity pressure across gateway instances.

Design

The implementation follows the existing EventRecorder / EventSink pattern already used for activity and worker events:

  • In-memory accumulator (RateLimitViolationTracker) -- uses scc::HashMap with per-bucket locking so that record() calls on hot request paths see little contention. A background task drains the map on a configurable interval (default 60s).
  • Sink trait (RateLimitViolationSink) -- abstracts persistence so tests can use InMemoryViolationSink or Noop without a database.
  • Database table (rate_limit_violations) -- each flush produces one row per gateway instance with gateway_name, total_count, and a details JSONB column mapping "client_id:limiter_name" to count.
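
The accumulate-then-flush flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it swaps scc::HashMap for a std Mutex<HashMap> so it builds with the standard library alone, and the method names (record_batch, flush), the field types, and the exact struct layout are assumptions. Only the type names, the "client_id:limiter_name" key format, and the gateway_name / total_count / details columns come from the description.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// One flushed row, mirroring the gateway_name, total_count and details columns.
#[derive(Debug, Clone, PartialEq)]
pub struct ViolationRow {
    pub gateway_name: String,
    pub total_count: u64,
    /// Maps "client_id:limiter_name" to the number of violations in this window.
    pub details: HashMap<String, u64>,
}

/// Persistence abstraction. The method name and signature are assumptions.
pub trait RateLimitViolationSink {
    fn record_batch(&self, row: ViolationRow);
}

/// Test sink that keeps flushed rows in memory instead of writing to a database.
#[derive(Default)]
pub struct InMemoryViolationSink {
    pub rows: Mutex<Vec<ViolationRow>>,
}

impl RateLimitViolationSink for InMemoryViolationSink {
    fn record_batch(&self, row: ViolationRow) {
        self.rows.lock().unwrap().push(row);
    }
}

pub struct RateLimitViolationTracker {
    gateway_name: String,
    // Stand-in for scc::HashMap: a single mutex is simpler but contends more.
    counts: Mutex<HashMap<String, u64>>,
}

impl RateLimitViolationTracker {
    pub fn new(gateway_name: &str) -> Self {
        Self {
            gateway_name: gateway_name.to_string(),
            counts: Mutex::new(HashMap::new()),
        }
    }

    /// Hot-path call: bump the counter for this client/limiter pair.
    pub fn record(&self, client_id: &str, limiter_name: &str) {
        let key = format!("{}:{}", client_id, limiter_name);
        *self.counts.lock().unwrap().entry(key).or_insert(0) += 1;
    }

    /// Drain the map and hand one row to the sink.
    /// Returns false (and writes nothing) when no violations were recorded.
    pub fn flush(&self, sink: &dyn RateLimitViolationSink) -> bool {
        // Swap the map for an empty one; the lock is released right after.
        let details = std::mem::take(&mut *self.counts.lock().unwrap());
        if details.is_empty() {
            return false;
        }
        let total_count: u64 = details.values().sum();
        sink.record_batch(ViolationRow {
            gateway_name: self.gateway_name.clone(),
            total_count,
            details,
        });
        true
    }
}
```

In this sketch a background task would call flush once per interval, so three violations from two clients within one window end up as a single row with total_count = 3 and two entries in details. Skipping empty windows is a design choice of the sketch; the PR may behave differently.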

Integration points

Violations are captured at both rate-limiting layers:

  1. Per-IP Salvo middleware -- after each limiter.handle() call, the response status is checked; if 429, the violation is recorded with the source IP and limiter name (basic, add_result, load, leader, metric, status, update_key, unauthorized_only, read).
  2. Distributed subject-based limiter -- when check_and_incr returns false in check_subject_limit, the violation is recorded with the subject type and ID before returning TooManyRequests.
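
The two hooks can be pictured with the sketch below. It is a simplified stand-in, not the code in src/http3/rate_limits.rs: the real helper presumably inspects a Salvo response, whereas here the status code is passed as a plain u16 so the example needs no external crates. The ViolationRecorder trait, the VecRecorder test double, the function signatures, and the way subject type and ID are mapped onto the recorded pair are all assumptions made for illustration.

```rust
use std::cell::RefCell;

const TOO_MANY_REQUESTS: u16 = 429;

/// Hypothetical minimal view of the tracker used by both hooks.
pub trait ViolationRecorder {
    fn record(&self, client_id: &str, limiter_name: &str);
}

/// Test double that remembers every recorded pair.
#[derive(Default)]
pub struct VecRecorder(pub RefCell<Vec<(String, String)>>);

impl ViolationRecorder for VecRecorder {
    fn record(&self, client_id: &str, limiter_name: &str) {
        self.0
            .borrow_mut()
            .push((client_id.to_string(), limiter_name.to_string()));
    }
}

/// Per-IP layer: call after the limiter has run. A violation is recorded
/// only when the limiter set the response status to 429.
pub fn record_ip_violation_if_limited(
    recorder: &dyn ViolationRecorder,
    status: u16,
    source_ip: &str,
    limiter_name: &str,
) -> bool {
    if status != TOO_MANY_REQUESTS {
        return false;
    }
    recorder.record(source_ip, limiter_name);
    true
}

#[derive(Debug, PartialEq)]
pub enum LimitError {
    TooManyRequests,
}

/// Subject layer: `allowed` stands for the result of check_and_incr.
/// The violation is recorded before the error is returned.
pub fn check_subject_limit(
    recorder: &dyn ViolationRecorder,
    allowed: bool,
    subject_type: &str,
    subject_id: &str,
) -> Result<(), LimitError> {
    if allowed {
        return Ok(());
    }
    // Assumption: the subject ID plays the role of the client ID and the
    // subject type plays the role of the limiter name.
    recorder.record(subject_id, subject_type);
    Err(LimitError::TooManyRequests)
}
```

Keeping the recording step behind one small helper means each of the per-IP handlers only adds a single call after limiter.handle(), and non-429 responses pay no more than a status comparison.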

Changes

  • dev-env/init-scripts/init-schema.sql -- new rate_limit_violations table with index on (gateway_name, created_at DESC)
  • Cargo.toml -- added with-serde_json-1 feature to tokio-postgres for JSONB parameter support
  • src/config.rs -- added violation_flush_interval_sec field to DbConfig (default 60s)
  • src/db/mod.rs -- RateLimitViolationSink trait, InMemoryViolationSink, ViolationSinkHandle enum, ViolationRow
  • src/db/repository.rs -- Database::record_violations_batch INSERT query
  • src/db/violation_tracker.rs -- new file: RateLimitViolationTracker struct with accumulator + periodic flusher + 5 unit tests
  • src/http3/rate_limits.rs -- record_ip_violation_if_limited helper; hooks in all 9 per-IP handlers and check_subject_limit
  • src/http3/state.rs -- violation_tracker field + accessor on HttpState
  • src/http3/server.rs -- pass tracker through Http3Server::run
  • src/raft/mod.rs -- create tracker at startup alongside EventRecorder
  • src/test_support.rs -- accept ViolationSinkHandle in build_shared_harness_core
  • tests/client_http_api/support.rs -- ViolationTestHarness + build_harness_with_violation_tracking
  • tests/client_http_api/rate_limit_violations.rs -- new file: 4 integration tests
  • tests/event_tracker/support.rs -- pass ViolationSinkHandle::Noop to updated harness builder
  • dev-env/config/*.toml -- added violation_flush_interval_sec = 60 to all dev configs

Database migration

Run this manually against the database of each existing deployment before rolling out the new version:

CREATE TABLE IF NOT EXISTS rate_limit_violations (
  id BIGSERIAL PRIMARY KEY,
  gateway_name VARCHAR(255) NOT NULL,
  total_count BIGINT NOT NULL,
  details JSONB NOT NULL DEFAULT '{}',
  created_at TIMESTAMP WITHOUT TIME ZONE DEFAULT (NOW() AT TIME ZONE 'UTC')
);
CREATE INDEX IF NOT EXISTS idx_rlv_gateway_created
  ON rate_limit_violations(gateway_name, created_at DESC);
