feat: Major features v1.3.0 - Cost Tracking, Caching, Multi-User, A/B Testing by murataslan1 · Pull Request #7 · starbaser/ccproxy

murataslan1 · 2025-12-19T02:59:09Z

Summary

This PR adds major new features to ccproxy v1.3.0: cost tracking, request caching, multi-user support, and an A/B testing framework. All new modules have accompanying tests, with 356 tests passing.

New Modules

| Module | Description |
| --- | --- |
| `metrics.py` | Cost tracking with budget alerts |
| `cache.py` | LRU request cache with TTL |
| `users.py` | Multi-user token/cost limits |
| `ab_testing.py` | Model comparison framework |
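For readers unfamiliar with the "LRU request cache with TTL" idea, here is a minimal sketch. It is not the code in `cache.py`; the class name `TTLLRUCache`, its parameters, and the injectable `clock` are assumptions made for illustration.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional


class TTLLRUCache:
    """Bounded cache: evicts the least recently used entry and expires old ones."""

    def __init__(
        self,
        max_size: int = 128,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps key -> (expiry time, value); insertion order tracks recency.
        self._entries: "OrderedDict[Hashable, tuple]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used
```

A request cache built this way would typically key on a hash of the model name and message payload; that keying scheme is likewise an assumption, not something stated in this PR.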

Bug Fixes

OAuth graceful fallback when credentials missing
Router initialization race condition
Request metadata memory leak (TTL + max size)
Model reload thrashing with cooldown

Enhancements

ccproxy status --health for health metrics
ccproxy shell-integration for shell aliases
OAuth token background refresh
Configuration validation at startup
Global tokenizer cache for performance
Request retry with exponential backoff
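The "request retry with exponential backoff" item can be sketched as follows. This is an illustrative sketch only: the function name `retry_with_backoff`, the default delays, and the 10% jitter are assumptions, not values taken from the PR.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable for testing
) -> T:
    """Call `call` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay grows as base * 2^attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Small random jitter avoids synchronized retries across clients.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")
```

In practice one would retry only on transient errors (timeouts, HTTP 429/5xx) rather than on every exception; the broad `except` here keeps the sketch short.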

Documentation

docs/troubleshooting.md - Common issues
docs/architecture.md - System design
docs/examples.md - Configuration examples
Updated README.md with all new features

Tests

356 tests passing
71% code coverage
New test files for all new modules

- Replace argparse with Tyro for type-safe CLI - Use dataclasses for configuration (ProxyConfig, GlobalConfig) - Implement decorator-based subcommand API to avoid 'command:' prefix - Maintain full feature parity with legacy CLI - Add ccproxy-legacy entry point for backwards compatibility - Add type stubs for tyro to ensure mypy compatibility - Update dependencies to include tyro>=0.7.0

- Removed ConfigProvider dependency injection pattern - Updated all usages to use get_config() singleton directly - Fixed thread safety with double-check locking pattern - Fixed all ruff and mypy pre-commit failures - Updated type stubs for external libraries - All tests passing with 92.96% coverage
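The double-check locking pattern mentioned in this commit can be sketched as below. `get_config()` is the name used in the commit message; the `Config` dataclass and the module-level variables are hypothetical stand-ins, and the real implementation may differ.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    """Hypothetical stand-in for the real configuration object."""
    debug: bool = False


_config: Optional[Config] = None
_config_lock = threading.Lock()


def get_config() -> Config:
    """Return the process-wide Config, creating it at most once."""
    global _config
    if _config is None:  # first check: cheap, no lock on the hot path
        with _config_lock:
            if _config is None:  # second check: another thread may have won
                _config = Config()
    return _config
```

The second check is what makes this safe: two threads can both pass the first check, but only one of them constructs the object once inside the lock.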

- Remove Start, Stop, Status commands and CCProxyManager class
- Add new LiteLLM command that wraps litellm with config file
- Update tests to match new CLI structure
- Fix httpx type stub (TimeoutError -> TimeoutException)
- Maintain 92.43% test coverage

The ccproxy CLI now provides:
- `ccproxy litellm [args]` - Run LiteLLM with ccproxy configuration
- `ccproxy install` - Install configuration files
- `ccproxy run <command>` - Run commands with proxy environment

- Tyro converts PascalCase class names to kebab-case commands - LiteLLM was becoming lite-llm instead of litellm - Changed to Litellm which properly converts to litellm command
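To see why `LiteLLM` became `lite-llm` while `Litellm` becomes `litellm`, here is a rough approximation of a PascalCase-to-kebab-case conversion. It is not Tyro's actual code; `pascal_to_kebab` is a hypothetical helper that only reproduces the behavior described in this commit.

```python
import re


def pascal_to_kebab(name: str) -> str:
    """Insert a hyphen wherever a lowercase letter or digit meets an uppercase letter."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name).lower()


print(pascal_to_kebab("LiteLLM"))  # the e/L boundary yields a hyphen
print(pascal_to_kebab("Litellm"))  # no internal boundary, so no hyphen
```

Under this rule the only way to get a hyphen-free command name is to avoid an internal lowercase-to-uppercase boundary, which is what the rename to `Litellm` does.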

- Add detach parameter to Litellm dataclass with -d alias - Implement background process execution with PID file tracking - Save PID to config_dir/litellm.lock - Redirect stdout/stderr to config_dir/litellm.log (non-appending) - Check for existing running process before starting new one - Clean up stale PID files automatically - Add comprehensive tests for detach functionality - All tests passing with 92.08% coverage

- Add Stop dataclass for the stop command
- Implement stop_litellm function that:
  - Checks for litellm.lock PID file
  - Attempts graceful shutdown with SIGTERM
  - Falls back to SIGKILL if needed after 0.5s
  - Cleans up stale PID files
  - Provides clear user feedback
- Add comprehensive tests for all stop scenarios
- All tests passing with 92.55% coverage
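The SIGTERM-then-SIGKILL sequence described above can be sketched like this for POSIX systems. `stop_process`, its return strings, and the polling interval are assumptions for illustration; the sketch takes a PID directly and leaves out reading and cleaning up `litellm.lock`.

```python
import os
import signal
import time


def _is_running(pid: int) -> bool:
    """Signal 0 performs the existence check without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def stop_process(pid: int, grace_seconds: float = 0.5) -> str:
    """Ask a process to exit with SIGTERM, then force it with SIGKILL."""
    if not _is_running(pid):
        return "not_running"  # caller would treat the PID file as stale
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not _is_running(pid):
            return "terminated"
        time.sleep(0.05)
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return "terminated"  # exited between the last poll and the kill
    return "killed"
```

One caveat: a child that has exited but has not yet been reaped by its parent still answers signal 0, so this check can report "killed" for a process that actually stopped on SIGTERM.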

- Add `ccproxy logs` command to view LiteLLM log file - Support -f/--follow option for tail -f functionality - Support -n/--lines option to control number of lines shown - Use system PAGER for viewing logs (defaults to less) - Add comprehensive tests for all log viewing scenarios - Add rich type stubs for mypy compatibility

- Add defensive type checking to handle both float timestamps and timedelta objects - Fix TypeError when LiteLLM passes timedelta objects instead of float timestamps - Apply fix to async_log_success_event, async_log_failure_event, and async_log_stream_event - Add proper mypy type ignores for operator overloading on union types - Resolves runtime error: type datetime.timedelta doesn't define __round__ method
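A minimal sketch of the defensive type handling described here, assuming both arguments are of the same kind (two timestamps, two datetimes, or two timedeltas). `calculate_duration_ms` is named in a later commit in this PR; this body and the `_to_seconds` helper are illustrative, not the project's code.

```python
from datetime import datetime, timedelta
from typing import Union

TimeValue = Union[float, int, datetime, timedelta]


def _to_seconds(value: TimeValue) -> float:
    """Normalize the supported time representations to float seconds."""
    if isinstance(value, timedelta):
        return value.total_seconds()
    if isinstance(value, datetime):
        return value.timestamp()
    return float(value)


def calculate_duration_ms(start: TimeValue, end: TimeValue) -> float:
    """Elapsed milliseconds between two values of the same kind."""
    # round() is applied to a plain float here, never to a timedelta,
    # which is what avoids the __round__ TypeError quoted above.
    return round((_to_seconds(end) - _to_seconds(start)) * 1000, 2)
```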

- Set CCPROXY_CONFIG_DIR environment variable in CLI before starting LiteLLM - Update config loading to check environment variable first, then fallback - Change fallback path from current directory to ~/.ccproxy - Fixes MatchModelRule not matching claude-3-5-haiku-20241022 model name - Resolves issue where proxy_server.config_path is None in LiteLLM runtime

- Add OAuth token forwarding logic in CCProxyHandler.async_pre_call_hook()
- Only forward tokens when User-Agent contains 'claude-cli'
- Only apply to Anthropic models (anthropic/* or claude*)
- Extract OAuth token from secret_fields.raw_headers.authorization
- Forward via provider_specific_header.extra_headers.authorization
- Add comprehensive edge case handling with null checks
- Update all tests to match actual implementation structure
- Add tests for edge cases and missing data scenarios

This enables Claude Code OAuth tokens (sk-ant-oat01-*) to be properly forwarded to Anthropic's API when using the LiteLLM proxy.
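The forwarding rules above can be sketched on plain dictionaries. This is an assumption-heavy illustration: the real hook receives LiteLLM's request data, where `secret_fields` may not be a plain dict, and the lowercase `user-agent` key inside `raw_headers` is a guess. `forward_oauth` here is a hypothetical standalone function.

```python
from typing import Any, Dict


def forward_oauth(data: Dict[str, Any]) -> Dict[str, Any]:
    """Copy the caller's Authorization header onto the outgoing provider request."""
    # Null-safe lookups: any missing level simply yields an empty dict.
    raw_headers = (data.get("secret_fields") or {}).get("raw_headers") or {}
    user_agent = str(raw_headers.get("user-agent", ""))
    token = raw_headers.get("authorization")
    model = str(data.get("model", ""))

    is_claude_cli = "claude-cli" in user_agent
    is_anthropic = model.startswith("anthropic/") or model.startswith("claude")
    if not (is_claude_cli and is_anthropic and token):
        return data  # leave the request untouched

    provider_header = data.setdefault("provider_specific_header", {})
    extra_headers = provider_header.setdefault("extra_headers", {})
    extra_headers["authorization"] = token
    return data
```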

- Add ShellIntegration subcommand to CLI - Auto-detect shell type (bash/zsh) or allow explicit specification - Generate shell scripts that check if LiteLLM proxy is running - Dynamically create/remove 'claude' alias based on proxy status - Support automatic installation to shell config files - Use precmd_functions for zsh and PROMPT_COMMAND for bash - Check proxy status via PID file (litellm.lock) - Add comprehensive tests for shell integration - Update README with shell integration documentation This allows users to automatically have 'claude' aliased to 'ccproxy run claude' whenever the LiteLLM proxy server is running, and removes the alias when stopped.

- Remove YAML loading fallback, use litellm.proxy.proxy_server.llm_router directly - Remove _load_models_from_yaml and _get_fallback_model methods - Simplify fallback to use 'default' labeled model - Move time calculation logic to utils module (calculate_duration_ms) - Remove error message redacting as litellm handles it - Update all tests to mock proxy_server instead of YAML loading - Add test helpers for consistent proxy_server mocking - Fix subprocess.run env parameter in CLI tests - Add rich library type stubs for mypy - Maintain 93% test coverage with 194 passing tests

…nsistency - Remove dead code: ccproxy_get_model function was only used in tests - Refactor tests to use CCProxyHandler.async_pre_call_hook directly - Fix label inconsistency in test_handler_logging.py (large_context → token_count) - Maintain test coverage above 90% (achieved 92.36%) This simplifies the codebase by removing unnecessary backward compatibility and makes tests more direct by using the actual handler methods.

- Modified _determine_routed_model to properly use router's get_model_for_label - Removed hardcoded 'claude-3-5-sonnet-20241022' fallback - Changed fallback to 'unknown' when no model is specified - Added test for behavior when no 'default' model is configured - Fixed test import to include RuleConfig BREAKING CHANGE: When no 'default' label is configured and no rules match, the handler now preserves the original model instead of using a hardcoded fallback

OAuth tokens from Claude CLI are now forwarded based on whether the final routed model is going to the Anthropic provider, not based on the original request. This ensures OAuth tokens are properly forwarded when any model gets routed to Anthropic. - Changed handler.py to check routed_model instead of original_model - Updated test_oauth_forwarding_with_routed_model to verify forwarding - Added test_no_oauth_forwarding_when_routed_to_non_anthropic test

- Create hooks.py module with separate processing hooks:
  - classify_hook: Handles request classification
  - rewrite_model_hook: Routes to appropriate model based on label
  - forward_oauth_hook: Handles OAuth token forwarding for Claude CLI
- Remove monolithic _determine_routed_model function
- Simplify CCProxyHandler to use hook pipeline pattern
- Improve OAuth forwarding to check actual API destination
- Update metadata field names for clarity:
  - ccproxy_original_model → ccproxy_alias_model
  - ccproxy_routed_model → ccproxy_litellm_model
- Add comprehensive tests for new OAuth forwarding logic

BREAKING CHANGE: Metadata field names have changed. Update any code that relies on ccproxy_original_model or ccproxy_routed_model fields.

…ivate - Delete unused singleton.py module - Make clear_rules() private by renaming to _clear_rules() - Remove redundant reset_rules() method - Update tests to use private methods appropriately - Remove types.py reference from CLAUDE.md - Fix test mock specifications

- Remove OAuth tokens from logs to prevent credential exposure
- Replace hard assertions with safe defaults and error logging
- Add exception handling for hook execution to prevent request failures
- Update input validation to handle edge cases gracefully

Security improvements:
- OAuth tokens no longer logged, only auth presence indicated
- Failed hooks now logged but don't crash entire request
- Invalid inputs handled with warnings and default values

BREAKING CHANGE: Hook failures no longer raise exceptions, they log errors and continue processing. This may change error handling behavior for custom hooks that expect exceptions to propagate.

fix: address security, performance, and accuracy issues

- Performance: add debug check for rich console output to reduce latency
- Security: fix OAuth domain validation to prevent subdomain attacks
- Accuracy: implement tiktoken-based token counting for better precision
- Tests: update tests to work with realistic token counting behavior
- Fix: improve exception handling and add tiktoken type stubs

Fixes handler.py:94-127, hooks.py:74-83, rules.py:62-72

- Add comprehensive documentation for config precedence order - Clarify priority: ENV > proxy dir > ~/.ccproxy (fallback) - Add structured logging for config source discovery - Improve error handling and debugging information Fixes config.py:166-205 unpredictable behavior issue
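The documented precedence (ENV > proxy dir > ~/.ccproxy) can be sketched as a small resolver. `CCPROXY_CONFIG_DIR` comes from an earlier commit in this PR; the function name, its parameters, and passing the environment and home directory explicitly are assumptions made so the sketch is easy to test.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_dir(
    env: Mapping[str, str],
    proxy_dir: Optional[Path],
    home: Path,
) -> Path:
    """Pick the config directory using ENV > proxy dir > ~/.ccproxy."""
    env_value = env.get("CCPROXY_CONFIG_DIR")
    if env_value:
        return Path(env_value)  # 1. explicit environment override
    if proxy_dir is not None:
        return proxy_dir  # 2. directory of the running proxy's config
    return home / ".ccproxy"  # 3. fallback


# Typical call with the real environment; no proxy directory is known here.
config_dir = resolve_config_dir(os.environ, None, Path.home())
```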

- Update test data to use realistic text patterns for accurate token counting - Replace simple repeated characters with varied sentences for proper tokenization - Add comprehensive tests for calculate_duration_ms utility function - Remove obsolete test_env.py and empty test_handler_temp.py files - Fix string formatting style in utils.py error message

Add --json flag to status command that outputs structured data: - proxy: boolean status - config: object with file paths - callbacks: array from litellm_settings - log: path string or null Also refactor status display to use headerless rich table and extract callbacks from config.yaml for both JSON and rich output.

Replace static ccproxy.py template with runtime generation to solve uv tool isolation issues. Handler file is now generated from config on each start, allowing custom handler classes and eliminating import errors when ccproxy and litellm are in separate environments.

Changes:
- Add generate_handler_file() function to parse handler config
- Remove ccproxy.py from install template files
- Generate handler before starting LiteLLM proxy
- Add handler field to CCProxyConfig schema
- Update tests to reflect auto-generation pattern
- Add comprehensive handler generation test suite (7 new tests)
- Update documentation for new workflow

Update documentation to reflect auto-generated handler workflow and requirement to install ccproxy with litellm bundled. Changes: - Add installation instructions with uv tool --with flag - Document that ccproxy.py is auto-generated on startup - Add Development Setup section with local workflow - Add Troubleshooting section for common import errors - Add handler configuration field documentation - Update prerequisites to explain environment requirements - Remove outdated manual setup instructions

- Remove incorrect --from flag with git URLs - Add PyPI as primary installation method - List GitHub installation as alternative - Fix all installation examples across README and docs

When ccproxy spawns litellm subprocess, it now uses the litellm executable from the same virtual environment instead of relying on PATH resolution. This prevents conflicts when multiple litellm installations exist (e.g., broken standalone tool in ~/.local/bin). The fix derives the litellm path from sys.executable, ensuring we always use the bundled version installed with --with flag. Fixes "ModuleNotFoundError: No module named 'backoff'" error that occurred when PATH found a broken standalone litellm installation.
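Deriving the executable path from `sys.executable` might look like the sketch below. It assumes a POSIX virtual environment layout, where console scripts sit next to the interpreter in `bin/`; on Windows the script would be `Scripts\litellm.exe`. The function name is hypothetical.

```python
import sys
from pathlib import Path


def bundled_litellm_path(python_executable: str = sys.executable) -> Path:
    """Return the litellm script that lives beside the given interpreter."""
    return Path(python_executable).parent / "litellm"


# Launching this absolute path (instead of the bare name "litellm")
# bypasses PATH lookup, so a broken copy in ~/.local/bin is never picked up.
litellm_cmd = [str(bundled_litellm_path()), "--help"]
```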

Extend ccproxy to allow specifying a custom User-Agent header for each OAuth token source, enabling different clients to be identified and routed separately. - OAuthSource Pydantic model for flexible config - get_oauth_user_agent() method in CCProxyConfig - Auto-detect Claude Code version for user-agent - Comprehensive test coverage

Update all claude-opus-4-1-20250805 references to claude-opus-4-5-20251101 across documentation, templates, and tests.

- Add capture_headers hook for logging HTTP headers with sensitive redaction - Add SENSITIVE_PATTERNS for authorization, x-api-key, cookie redaction - Update forward_oauth to use new multi-provider OAuth system

- Replace deprecated 'credentials' with 'oat_sources' - Add capture_headers hook - Add example for extended OAuth config with user_agent

- Add hook parameter support via dict format in ccproxy.yaml - Hooks can now specify params: { headers: [...] } for hook-specific configuration - capture_headers hook now accepts optional headers filter parameter - ccproxy status command now displays hooks table with parameters - ccproxy status command now displays model deployments table from LiteLLM config - Model aliases (e.g., 'default') resolve to target model's API base for display

Eliminates the need to manually type '--' when running commands with flags: Before: ccproxy run -- claude -p foo After: ccproxy run claude -p foo The entry_point now intercepts argv and automatically inserts '--' after 'run' to prevent tyro from parsing command arguments as ccproxy flags. Backwards compatible - explicit '--' is still supported.
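The argv rewrite described above can be sketched as follows, assuming `run` is always the first argument after the program name. `insert_run_separator` is a hypothetical name, and the real entry point may handle more cases (for example global flags before `run`).

```python
from typing import List


def insert_run_separator(argv: List[str]) -> List[str]:
    """Insert '--' after 'run' so later flags belong to the wrapped command."""
    if len(argv) < 2 or argv[1] != "run":
        return list(argv)  # some other subcommand: leave untouched
    if len(argv) > 2 and argv[2] == "--":
        return list(argv)  # explicit separator already present
    return [argv[0], "run", "--", *argv[2:]]


print(insert_run_separator(["ccproxy", "run", "claude", "-p", "foo"]))
```

A side effect of this simple version is that `ccproxy run --help` would also be rewritten, so a real implementation probably needs to special-case the subcommand's own flags.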

Remove tests that asserted old Anthropic-only OAuth forwarding behavior. The new implementation uses get_llm_provider for multi-provider support. Changes: - Remove TestCredentialsLoading (obsolete 'credentials' field) - Remove tests asserting OAuth NOT forwarded for non-Anthropic - Update credential fallback tests to use oat_sources API - Fix CLI tests to handle full litellm executable path - Fix test_multiple_providers to use passthrough mode

Add 22 tests covering: - Basic header capture with and without filtering - Case-insensitive header filtering - Sensitive header redaction (authorization, x-api-key, cookie) - Long header value truncation - HTTP method and path extraction - Raw headers from secret_fields merging - Edge cases (empty headers, missing metadata, etc.) Coverage: hooks.py 75% → 97%, total 78.65% → 81.55%
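The behaviors these tests cover (case-insensitive filtering, redaction of sensitive headers, truncation of long values) can be sketched like this. It is not the `capture_headers` hook from hooks.py: the constant names, the `[REDACTED]` placeholder, and the 200-character limit are assumptions.

```python
from typing import Dict, Iterable, Optional

SENSITIVE_HEADERS = ("authorization", "x-api-key", "cookie")
MAX_VALUE_LENGTH = 200  # hypothetical truncation limit


def capture_headers(
    headers: Dict[str, str],
    only: Optional[Iterable[str]] = None,
) -> Dict[str, str]:
    """Return a log-safe copy of `headers`, optionally limited to `only`."""
    wanted = {name.lower() for name in only} if only is not None else None
    captured: Dict[str, str] = {}
    for name, value in headers.items():
        key = name.lower()  # header names are case-insensitive
        if wanted is not None and key not in wanted:
            continue
        if key in SENSITIVE_HEADERS:
            captured[key] = "[REDACTED]"  # never log credentials
        elif len(value) > MAX_VALUE_LENGTH:
            captured[key] = value[:MAX_VALUE_LENGTH] + "..."
        else:
            captured[key] = value
    return captured
```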

LiteLLM doesn't preserve custom metadata from async_pre_call_hook to logging callbacks. Implemented thread-safe global store to pass trace_metadata between callbacks, then update LangFuse traces directly via SDK in async_log_success_event.

Claude Code embeds session info in the metadata.user_id field with format: user_{hash}_account_{uuid}_session_{uuid} The extract_session_id hook parses this and sets: - metadata["session_id"] for LangFuse session grouping - trace_metadata["claude_user_hash"] and ["claude_account_id"]
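Parsing the `user_{hash}_account_{uuid}_session_{uuid}` format could be done with a single regular expression, as in the sketch below. The function name and the exact character classes are assumptions; the actual `extract_session_id` hook may be more lenient.

```python
import re
from typing import Dict, Optional

_USER_ID_RE = re.compile(
    r"^user_(?P<claude_user_hash>[^_]+)"
    r"_account_(?P<claude_account_id>[0-9a-fA-F-]+)"
    r"_session_(?P<session_id>[0-9a-fA-F-]+)$"
)


def parse_user_id(user_id: str) -> Optional[Dict[str, str]]:
    """Split a Claude Code metadata.user_id into its three parts."""
    match = _USER_ID_RE.match(user_id)
    return match.groupdict() if match else None
```

Returning `None` on a non-matching value lets the caller skip session grouping instead of failing the request.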

Detect user's custom ccproxy.py files and skip auto-generation to avoid overwriting. Files containing "Auto-generated handler file" marker are safe to overwrite; all others are preserved with a warning panel. - Check for auto-generated marker before overwriting - Display rich warning panel with instructions - Suggest removing file for auto-generation or setting handler config - Add comprehensive tests for all scenarios
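The overwrite check can be sketched in a few lines. The marker string is quoted from the commit message; the function name and the decision to treat a missing file as safe to write are assumptions.

```python
from pathlib import Path

AUTO_GENERATED_MARKER = "Auto-generated handler file"


def can_overwrite_handler(path: Path) -> bool:
    """True if `path` is absent or was produced by auto-generation."""
    if not path.exists():
        return True
    text = path.read_text(encoding="utf-8", errors="replace")
    return AUTO_GENERATED_MARKER in text
```

When this returns `False`, the caller would keep the user's file and show the warning instead of regenerating.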

Enables external tools to detect ccproxy context by comparing ANTHROPIC_BASE_URL against the proxy URL, preventing infinite recursion in wrapper scripts.

Health checks are tagged with "litellm-internal-health-check" in metadata. When ccproxy's hooks rewrite model names, health checks validate the wrong model, causing failures. Now we detect health check requests early and return unmodified data, allowing LiteLLM to validate actual configured models.
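A sketch of the early return for health checks follows. The tag string is from the commit message, but where it lives is an assumption: this sketch looks for it in a `metadata["tags"]` list, and `run_hooks` is a hypothetical stand-in for the handler's hook pipeline.

```python
from typing import Any, Callable, Dict, Iterable

HEALTH_CHECK_TAG = "litellm-internal-health-check"
Hook = Callable[[Dict[str, Any]], Dict[str, Any]]


def is_health_check(data: Dict[str, Any]) -> bool:
    """Detect LiteLLM's internal health check requests (assumed tag location)."""
    metadata = data.get("metadata") or {}
    tags = metadata.get("tags") or []
    return HEALTH_CHECK_TAG in tags


def run_hooks(data: Dict[str, Any], hooks: Iterable[Hook]) -> Dict[str, Any]:
    """Apply hooks in order, but pass health checks through unmodified."""
    if is_health_check(data):
        return data  # let LiteLLM validate the model it actually configured
    for hook in hooks:
        data = hook(data)
    return data
```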

Add critical note that the project name is `ccproxy` (lowercase), not "CCProxy". PascalCase is reserved for class names only.

Add ML artifacts and Prisma ignores from feat/logging and feat/mitm.

## New Features

### Core Modules Added

- **Cost Tracking** (metrics.py): Per-request cost calculation, budget limits, alerts
- **Request Caching** (cache.py): LRU cache with TTL, invalidation strategies
- **Multi-User Support** (users.py): Per-user token/cost limits, model access control
- **A/B Testing** (ab_testing.py): Model comparison with statistical analysis

### Bug Fixes

- OAuth graceful fallback when credentials missing
- Router initialization race condition fixed
- Request metadata store memory leak fixed (TTL + max size)
- Model reload thrashing fixed with cooldown
- Default config now works out of the box

### Enhancements

- Health check endpoint: ccproxy status --health
- Shell integration: ccproxy shell-integration --shell [bash|zsh|fish]
- OAuth token background refresh
- Configuration validation at startup
- Global tokenizer cache for performance
- Request retry with exponential backoff

### Documentation

- docs/troubleshooting.md: Common issues and solutions
- docs/architecture.md: System design with ASCII diagrams
- docs/examples.md: Configuration examples for various use cases
- README.md updated with all new features

### Tests

- 356 tests passing
- New test files: test_metrics, test_cache, test_users, test_ab_testing, etc.
- Coverage improved to 71%

starbaser · 2025-12-20T03:20:47Z

Before reviewing 6,000 lines: can you write up, in your own words, what problem you're solving and what value this adds?
For the bug fixes, please create separate issues with reproduction steps. Then open PRs referencing those issues.

murataslan1 · 2025-12-21T18:24:42Z

Hey,

You're right, 6k lines was way too much. My bad.

Been thinking about this - wanna break it into small PRs instead. Each one solving a specific thing.

First one would be config validation with helpful errors. Fixes that "default config doesn't work" issue. Pretty small change, maybe couple hundred lines, no breaking stuff. Basically when someone messes up the YAML they get an actual helpful message instead of a stack trace.

After that I've got some other stuff lined up (oauth race condition, connection pooling etc) but let's start small.

Want me to go ahead with the config one?

starbaser · 2026-01-06T18:13:40Z

Happy new year! actually yea config validation would be great, but the "default" behavior is what, normal claude code usage? users interacting through their llm harness may not be aware of adding ccproxy as a dep, add the alias from the readme and their harness stops working.

However, a config assist would be too useful to pass up for what's coming in the pre-release I'm going to post today; it's all on the dev branch already. It has some important new features, including some that I would only talk about off-platform besides linking this. Also, sorry, it likely will break your in-progress version.

murataslan1 · 2026-01-08T12:52:14Z

Hey @starbaser, thanks for the heads up! I've switched to the dev branch as suggested to avoid conflicts with the new features.

I've implemented the config validation logic we discussed. It now catches YAML syntax errors and Pydantic validation errors, displaying them in a nice, readable red panel using rich (instead of crashing with a stack trace). I've verified it against valid and invalid configs.

I've pushed the changes to my fork. Should I open a new PR for this feature (feat/config-validation) targeting dev, or would you prefer to handle it differently?

starbaser added 30 commits July 31, 2025 12:02

some test scripts

8f57f7f

added tyro gitmcp

fe92cc7

some prep for tyro

fa7d5e4

tyro cli

4320bd5

config changes

90f2886

fix: change LiteLLM class to Litellm to fix tyro command parsing

04f558e

- Tyro converts PascalCase class names to kebab-case commands - LiteLLM was becoming lite-llm instead of litellm - Changed to Litellm which properly converts to litellm command

yay

c900a7d

oauth token working

0c7fa72

removed some generated project files that are no longer needed

5a0ffc2

prep for v1

1ec0431

starbaser and others added 25 commits November 18, 2025 11:26

docs: fix installation commands and prioritize PyPI

f3156c4

- Remove incorrect --from flag with git URLs - Add PyPI as primary installation method - List GitHub installation as alternative - Fix all installation examples across README and docs

discord link

11b2548

chore: update Opus model references to 4.5

747ab19

Update all claude-opus-4-1-20250805 references to claude-opus-4-5-20251101 across documentation, templates, and tests.

feat(hooks): add capture_headers hook and forward_oauth improvements

0dc01f3

- Add capture_headers hook for logging HTTP headers with sensitive redaction - Add SENSITIVE_PATTERNS for authorization, x-api-key, cookie redaction - Update forward_oauth to use new multi-provider OAuth system

chore: update ccproxy.yaml template to current format

3f24e42

- Replace deprecated 'credentials' with 'oat_sources' - Add capture_headers hook - Add example for extended OAuth config with user_agent

chore: bump version to 1.2.0

f9f96c5

feat(cli): add url field to status JSON output

b1ac10f

Enables external tools to detect ccproxy context by comparing ANTHROPIC_BASE_URL against the proxy URL, preventing infinite recursion in wrapper scripts.

docs: add python-extended standards import to CLAUDE.md

0aa46cb

docs: clarify project naming convention in CLAUDE.md

48c4150

Add critical note that the project name is `ccproxy` (lowercase), not "CCProxy". PascalCase is reserved for class names only.

docs: simplify README header and Discord link

39341ae

chore: merge gitignore updates from feature branches

c1c8763

Add ML artifacts and Prisma ignores from feat/logging and feat/mitm.

starbaser force-pushed the main branch from 0059375 to 83697b5 Compare May 25, 2026 20:25

murataslan1 wants to merge 112 commits into starbaser:main from murataslan1:feature/major-features-v1.3.0
