Summary
We want to integrate Chutes models into vllm-proxy while preserving the current production-stable behavior for existing users.
Instead of modifying the current /v1/chat/completions path, add dedicated Chutes routes:
POST /v1/chutes/chat/completions
GET /v1/chutes/models
This lets upstream gateways (e.g., Redpill) continue receiving standard OpenAI-style /v1/chat/completions requests and selectively forward Chutes-bound traffic to the new sidecar route.
Goals
- Keep existing /v1/chat/completions behavior unchanged
- Minimize coupling/risk to current production path
- Add Chutes integration behind explicit route + config flag
Proposed behavior
Existing route (unchanged)
POST /v1/chat/completions
- continues to forward to the local vLLM backend as today
New Chutes route
POST /v1/chutes/chat/completions
- reuses existing E2EE parse/decrypt logic on ingress
- forwards the plaintext request over TLS to the Chutes OpenAI-compatible endpoint: ${CHUTES_BASE_URL}/v1/chat/completions
- with Authorization: Bearer ${CHUTES_API_KEY}
- reuses existing E2EE encrypt logic on egress
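As a hedged sketch of the forwarding step (Python, standard library only; the E2EE decrypt/encrypt helpers already in vllm-proxy are not shown, and the config names are the ones proposed above), building the upstream request might look like:

```python
import json
import os
import urllib.request

# Assumed config names from this proposal; read here only for illustration.
CHUTES_BASE_URL = os.environ.get("CHUTES_BASE_URL", "https://llm.chutes.ai")
CHUTES_API_KEY = os.environ.get("CHUTES_API_KEY", "")


def build_chutes_request(plaintext_body: dict) -> urllib.request.Request:
    """Build the upstream POST sent after E2EE decrypt on ingress.

    The decrypted OpenAI-style body is forwarded as-is; the upstream
    response would then pass through the existing E2EE encrypt logic
    on egress before returning to the client.
    """
    return urllib.request.Request(
        url=f"{CHUTES_BASE_URL}/v1/chat/completions",
        data=json.dumps(plaintext_body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {CHUTES_API_KEY}",
        },
        method="POST",
    )
```

The actual handler would wrap this in the proxy's existing async request path; this only illustrates the URL, body, and auth-header shape of the upstream call.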
New Chutes models route
GET /v1/chutes/models
- proxies ${CHUTES_BASE_URL}/v1/models with the Chutes auth header
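The models route is a thin pass-through; a minimal sketch of the upstream GET it would issue (same assumed config names as above, standard library only):

```python
import os
import urllib.request

CHUTES_BASE_URL = os.environ.get("CHUTES_BASE_URL", "https://llm.chutes.ai")
CHUTES_API_KEY = os.environ.get("CHUTES_API_KEY", "")


def build_chutes_models_request() -> urllib.request.Request:
    """Build the upstream GET that /v1/chutes/models proxies verbatim."""
    return urllib.request.Request(
        url=f"{CHUTES_BASE_URL}/v1/models",
        headers={"Authorization": f"Bearer {CHUTES_API_KEY}"},
        method="GET",
    )
```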
Config
- CHUTES_ENABLED (default: false)
- CHUTES_BASE_URL (default: https://llm.chutes.ai)
- CHUTES_API_KEY (required if enabled)
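Startup validation of these flags could look like the following sketch (load_chutes_config and ChutesConfigError are hypothetical names; the point is the enabled/required relationship and that misconfiguration surfaces as 503, per the acceptance criteria):

```python
from typing import Optional


class ChutesConfigError(Exception):
    """Enabled-but-misconfigured state; handlers map this to HTTP 503."""


def load_chutes_config(env: dict) -> Optional[dict]:
    """Validate the three flags above; env is injected to ease testing."""
    if env.get("CHUTES_ENABLED", "false").lower() != "true":
        return None  # Chutes routes answer 503 while the feature is disabled
    api_key = env.get("CHUTES_API_KEY")
    if not api_key:
        raise ChutesConfigError("CHUTES_ENABLED=true but CHUTES_API_KEY is unset")
    return {
        "base_url": env.get("CHUTES_BASE_URL", "https://llm.chutes.ai"),
        "api_key": api_key,
    }
```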
Security note
This is not pure client-to-model cryptographic E2EE.
It is a practical TEE-mediated segmented model:
- Client ↔ Proxy: existing E2EE
- Proxy ↔ Chutes: TLS
- Plaintext is only visible inside trusted runtime boundaries
Acceptance criteria
- Existing route behavior remains unchanged
- Chutes route supports stream + non-stream
- Existing E2EE nonce/replay checks remain in effect
- Misconfiguration (e.g., enabled without CHUTES_API_KEY) returns an explicit 503 error
- No API-key leakage in logs
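One way to enforce the no-key-leakage criterion is a log redaction filter along these lines (redact_bearer is a hypothetical helper, not existing code):

```python
import re


def redact_bearer(line: str) -> str:
    """Mask any bearer token before a log line is emitted."""
    return re.sub(r"Bearer\s+\S+", "Bearer [REDACTED]", line)
```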