WebSocket Protocol
The CortexPrism WebSocket provides real-time streaming chat, audio communication, file upload, and tool call reasoning inspection.
ws://127.0.0.1:3000/ws # Client WebSocket
ws://127.0.0.1:3000/ws/node # Node WebSocket (Hub ↔ Node)
Authentication: when webAuth.requireAuth is enabled, the /ws endpoint checks session cookies before upgrading connections.
Client → Server:
{ "type": "chat", "message": "Hello", "sessionId": "sess_abc123", "files": [...] }
{ "type": "ping" }
{ "type": "new_session" }
{ "type": "select_agent", "agentId": "agent-1" }
{ "type": "audio_chunk", "data": "<base64>" }
{ "type": "audio_end" }
{ "type": "speak", "text": "Hello world" }

| Field | Type | Required | Description |
|---|---|---|---|
| `type` | `"chat"` | Yes | Message type |
| `message` | string | Yes | User message text |
| `sessionId` | string | No | Resume existing session |
| `files` | array | No | Uploaded files `[{filename, mimeType, data (base64)}]` |
Files are sent base64-encoded over the WebSocket alongside chat messages. The server saves them to both the working directory and the agent workspace so that tools can access them. Text is extracted automatically from PDFs. Images are included as multimodal content blocks for supported providers (Anthropic, Google Gemini). For text-only providers, a note is appended suggesting a provider switch.
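The chat message with attachments can be sketched as a small builder. This is a minimal illustration, not CortexPrism client code: `buildChatMessage`, `LocalFile`, and `toBase64` are hypothetical names, and only the field names (`type`, `message`, `sessionId`, `files`, `filename`, `mimeType`, `data`) come from the table above.

```typescript
// Shape of one uploaded file, following the documented field list.
interface UploadedFile {
  filename: string;
  mimeType: string;
  data: string; // base64-encoded file contents
}

interface ChatMessage {
  type: "chat";
  message: string;
  sessionId?: string;
  files?: UploadedFile[];
}

// Hypothetical input type: a file held in memory as raw bytes.
interface LocalFile {
  filename: string;
  mimeType: string;
  bytes: Uint8Array;
}

// Encode raw bytes as base64. btoa expects a binary string, so each byte
// is first mapped to the character with the same code.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Build a chat message; optional fields are omitted when not provided.
function buildChatMessage(
  text: string,
  files: LocalFile[] = [],
  sessionId?: string,
): ChatMessage {
  const msg: ChatMessage = { type: "chat", message: text };
  if (sessionId) msg.sessionId = sessionId;
  if (files.length > 0) {
    msg.files = files.map((f) => ({
      filename: f.filename,
      mimeType: f.mimeType,
      data: toBase64(f.bytes),
    }));
  }
  return msg;
}
```

On an open socket, the result would be sent with something like `socket.send(JSON.stringify(buildChatMessage("Hello", files)))`.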
Server → Client:
{ "type": "connected" }
{ "type": "session", "sessionId": "sess_abc123" }
{ "type": "start" }
{ "type": "chunk", "delta": "Hello" }
{ "type": "reasoning", "content": "Agent is considering..." }
{ "type": "tool_call", "tool": "web_search", "args": {"query": "..."} }
{ "type": "tool_result", "tool": "web_search", "result": "..." }
{ "type": "done", "tokensIn": 100, "tokensOut": 50, "costUsd": 0.001, "durationMs": 800 }
{ "type": "error", "error": "Something went wrong" }
{ "type": "pong" }
{ "type": "audio", "data": "<base64 mp3>", "format": "mp3" }
{ "type": "voice_state", "listening": true, "enabled": true }
{ "type": "file_change", "path": "/workspace/file.ts" }

| Field | Type | Description |
|---|---|---|
| `tokensIn` | number | Input tokens used |
| `tokensOut` | number | Output tokens generated |
| `costUsd` | number | Estimated cost in USD |
| `durationMs` | number | Total turn duration |
| `modelMode` | `'manual' \| 'auto'` | Model selection mode for this turn (v0.46+) |
| `requestedModelMode` | string | Requested mode (matches modelMode) |
| `resolvedProvider` | string | LLM provider used (v0.46+) |
| `resolvedModel` | string | LLM model used (v0.46+) |
| `autoFallback` | boolean | Whether Auto mode fell back to heuristic (v0.46+) |
| `autoFallbackReason` | string | Reason for fallback: 'mqm_disabled', 'low_confidence', etc. (v0.46+) |
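One way a client might consume the server messages is to fold them into a per-turn state, appending each `chunk` delta and keeping the `done` statistics. The sketch below is illustrative only: `reduceTurn` and `TurnState` are hypothetical names, and treating the model-selection fields as optional is an assumption (most of them are marked v0.46+ above).

```typescript
// The done message, using the documented field names.
interface DoneMessage {
  type: "done";
  tokensIn: number;
  tokensOut: number;
  costUsd: number;
  durationMs: number;
  // Assumed optional, since older servers may not send them.
  modelMode?: "manual" | "auto";
  requestedModelMode?: string;
  resolvedProvider?: string;
  resolvedModel?: string;
  autoFallback?: boolean;
  autoFallbackReason?: string;
}

// Only the message types this sketch cares about.
type ServerMessage =
  | { type: "chunk"; delta: string }
  | { type: "reasoning"; content: string }
  | { type: "error"; error: string }
  | DoneMessage;

// Hypothetical client-side view of one turn.
interface TurnState {
  text: string;
  reasoning: string[];
  done?: DoneMessage;
  error?: string;
}

// Apply one raw WebSocket frame to the turn state and return the new state.
function reduceTurn(state: TurnState, raw: string): TurnState {
  const msg = JSON.parse(raw) as ServerMessage;
  switch (msg.type) {
    case "chunk":
      return { ...state, text: state.text + msg.delta };
    case "reasoning":
      return { ...state, reasoning: [...state.reasoning, msg.content] };
    case "done":
      return { ...state, done: msg };
    case "error":
      return { ...state, error: msg.error };
    default:
      // Other message types (start, pong, audio, ...) leave the turn unchanged.
      return state;
  }
}
```

Keeping the reducer pure makes it easy to test without a live connection; a real handler would call it from the socket's message callback.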
The reasoning message type delivers the agent's internal decision-making process (tool selection rationale, task assessment) as a separate stream. In the Web UI, this appears in a collapsible panel toggled by a 🔬 Reasoning button that shows during tool use and auto-hides when the response completes.
Client → Server:
{ "type": "audio_chunk", "data": "<base64>" }
{ "type": "audio_end" }

Server → Client:
{ "type": "speak", "text": "...", "voice": "alloy" }
{ "type": "audio", "data": "<base64>", "format": "mp3" }
{ "type": "voice_state", "listening": true, "enabled": true }

Transcribed speech is dispatched directly into the agent loop as a user message. Auto-TTS synthesizes agent responses to audio before the done signal.
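The client side of the audio exchange can be sketched as splitting a recording into `audio_chunk` messages followed by a single `audio_end`. This is an assumption-laden illustration: `toAudioMessages` is a hypothetical helper, the 4096-byte default chunk size is arbitrary, and the input bytes are assumed to already be in whatever audio format the server expects, which this page does not specify.

```typescript
type AudioMessage =
  | { type: "audio_chunk"; data: string }
  | { type: "audio_end" };

// Encode raw bytes as base64 via a binary string, as btoa requires.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Split audio bytes into base64 audio_chunk messages and terminate the
// utterance with audio_end. chunkBytes is a hypothetical tuning parameter.
function toAudioMessages(audio: Uint8Array, chunkBytes = 4096): AudioMessage[] {
  if (chunkBytes <= 0) {
    throw new RangeError("chunkBytes must be positive");
  }
  const messages: AudioMessage[] = [];
  for (let offset = 0; offset < audio.length; offset += chunkBytes) {
    const slice = audio.subarray(offset, offset + chunkBytes);
    messages.push({ type: "audio_chunk", data: toBase64(slice) });
  }
  messages.push({ type: "audio_end" });
  return messages;
}
```

Each element would then be serialized with `JSON.stringify` and sent in order over the client WebSocket.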
Include an existing sessionId in a chat message to resume across WebSocket reconnects and page reloads:
{ "type": "chat", "message": "Continue our conversation", "sessionId": "sess_abc123" }

The server reopens the per-session database, reactivates the session, and loads previous messages via loadHistory(). Session titles are displayed in the chat header.
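On the client, resuming could look like the following sketch: remember the `sessionId` from the server's `session` message and attach it to later `chat` messages. `SessionTracker`, `KeyValueStore`, and the storage key are hypothetical; in a browser, `localStorage` would satisfy the store interface and let the id survive a page reload.

```typescript
// Minimal key-value storage interface (localStorage has these two methods).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key, not defined by CortexPrism.
const SESSION_KEY = "cortexprism.sessionId";

class SessionTracker {
  constructor(private store: KeyValueStore) {}

  // Remember the session id whenever the server announces one.
  onServerMessage(raw: string): void {
    const msg = JSON.parse(raw) as { type?: string; sessionId?: unknown };
    if (msg.type === "session" && typeof msg.sessionId === "string") {
      this.store.setItem(SESSION_KEY, msg.sessionId);
    }
  }

  // Serialize a chat message, attaching the stored sessionId if one exists.
  chat(message: string): string {
    const sessionId = this.store.getItem(SESSION_KEY);
    return JSON.stringify(
      sessionId
        ? { type: "chat", message, sessionId }
        : { type: "chat", message },
    );
  }
}
```

Because the id lives in the store rather than in the socket object, a fresh connection after a reconnect picks it up without extra bookkeeping.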
- Tool call XML (`<tool_call>`) and bare JSON are stripped from chunks using a brace-depth walker algorithm on both the server and the client
- Streaming is buffered internally when tools are registered; only clean prose reaches the client
- Tool calls split across multiple WebSocket chunks are properly buffered and stripped
- The `file_change` event broadcasts on file edits, renames, and deletes
- WebSocket connections are upgraded from standard HTTP at `/ws`
- Node WebSocket at `/ws/node` uses token-based registration with heartbeat/ACK protocol
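To illustrate the stripping behavior described in the notes above, here is a minimal sketch of a streaming filter that buffers incomplete spans and tracks brace depth. It is not the CortexPrism implementation: `ToolCallStripper` and its helpers are hypothetical, and treating any balanced `{...}` span that parses as a JSON object as a tool call is a simplifying assumption.

```typescript
const OPEN = "<tool_call>";
const CLOSE = "</tool_call>";
const WAIT = -1; // span is incomplete; keep buffering
const NOT_TOOL = 0; // the leading character is ordinary prose

// End index of a <tool_call>...</tool_call> span starting at s[0].
function xmlSpanEnd(s: string): number {
  if (s.startsWith(OPEN)) {
    const close = s.indexOf(CLOSE);
    return close === -1 ? WAIT : close + CLOSE.length;
  }
  // A partial opening tag at the end of the buffer may still complete.
  return OPEN.startsWith(s) ? WAIT : NOT_TOOL;
}

// End index of a balanced {...} span starting at s[0], found by walking
// brace depth while skipping braces inside JSON strings.
function jsonSpanEnd(s: string): number {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = 0; i < s.length; i++) {
    const c = s[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (c === "\\") escaped = true;
      else if (c === '"') inString = false;
    } else if (c === '"') {
      inString = true;
    } else if (c === "{") {
      depth++;
    } else if (c === "}" && --depth === 0) {
      try {
        const value = JSON.parse(s.slice(0, i + 1));
        return typeof value === "object" && value !== null ? i + 1 : NOT_TOOL;
      } catch {
        return NOT_TOOL; // balanced braces, but not JSON: treat as prose
      }
    }
  }
  return WAIT;
}

class ToolCallStripper {
  private pending = "";

  // Feed one streamed chunk; returns the prose that is safe to display.
  push(chunk: string): string {
    this.pending += chunk;
    let out = "";
    while (this.pending.length > 0) {
      const start = this.pending.search(/[<{]/);
      if (start === -1) {
        out += this.pending;
        this.pending = "";
        break;
      }
      out += this.pending.slice(0, start);
      this.pending = this.pending.slice(start);
      const end = this.pending[0] === "{"
        ? jsonSpanEnd(this.pending)
        : xmlSpanEnd(this.pending);
      if (end === WAIT) break;
      if (end === NOT_TOOL) {
        out += this.pending[0];
        this.pending = this.pending.slice(1);
      } else {
        this.pending = this.pending.slice(end); // drop the tool call span
      }
    }
    return out;
  }

  // Call when the stream ends to release any text still being held back.
  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return rest;
  }
}
```

A limitation of this sketch is that prose containing an unmatched `{` stays buffered until `flush()` is called, so a production filter would likely need a size or time bound on the buffer.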
- REST API — HTTP API endpoints
- Voice Pipeline — Audio streaming details
- Agent Loop — Turn processing and streaming mechanics
CortexPrism — Open-source AI agent operating system · Discord · Apache 2.0 License · Built with Deno 2.x + TypeScript