From 3a653fce32dbdea1ea0b95e7b4332fc6eab21962 Mon Sep 17 00:00:00 2001 From: joshuaaferguson Date: Thu, 20 Nov 2025 11:25:40 -0700 Subject: [PATCH] feat: Introduce platform-agnostic architecture with control plane and distributed controllers, updating project overview and adding design specifications. --- .claude/MULTI_AGENT_PLAN.md | 203 ------------------------ .claude/multi-agent/MULTI_AGENT_PLAN.md | 105 ++++++++---- ANALYSIS_REPORT.md | 80 ++++++++++ CLAUDE.md | 140 ++++++++++++++-- README.md | 29 ++-- docs/ARCHITECTURE.md | 138 ++++++++-------- docs/CONTROLLER_SPEC.md | 101 ++++++++++++ 7 files changed, 464 insertions(+), 332 deletions(-) delete mode 100644 .claude/MULTI_AGENT_PLAN.md create mode 100644 ANALYSIS_REPORT.md create mode 100644 docs/CONTROLLER_SPEC.md diff --git a/.claude/MULTI_AGENT_PLAN.md b/.claude/MULTI_AGENT_PLAN.md deleted file mode 100644 index d9c2eb0b..00000000 --- a/.claude/MULTI_AGENT_PLAN.md +++ /dev/null @@ -1,203 +0,0 @@ -# StreamSpace Multi-Agent Orchestration Plan - -**Project:** StreamSpace - Kubernetes-native Container Streaming Platform -**Repository:** https://github.com/JoshuaAFerguson/streamspace -**Current Version:** v1.0.0 (Production Ready) -**Next Phase:** v2.0.0 - VNC Independence (TigerVNC + noVNC stack) - ---- - -## Agent Roles - -### Agent 1: The Architect (Research & Planning) -- **Responsibility:** System exploration, requirements analysis, architecture planning -- **Authority:** Final decision maker on design conflicts -- **Focus:** Phase 6 planning, integration strategies, migration paths - -### Agent 2: The Builder (Core Implementation) -- **Responsibility:** Feature development, core implementation work -- **Authority:** Implementation patterns and code structure -- **Focus:** Controller logic, API endpoints, UI components - -### Agent 3: The Validator (Testing & Validation) -- **Responsibility:** Test suites, edge cases, quality assurance -- **Authority:** Quality gates and test coverage requirements -- 
**Focus:** Integration tests, E2E tests, security validation - -### Agent 4: The Scribe (Documentation & Refinement) -- **Responsibility:** Documentation, code refinement, developer guides -- **Authority:** Documentation standards and examples -- **Focus:** API docs, deployment guides, plugin tutorials - ---- - -## Current Focus: Phase 6 - VNC Independence - -### Objective -Migrate from current VNC solution to TigerVNC + noVNC stack for complete open-source independence. - -### Success Criteria -- [ ] Zero proprietary VNC dependencies -- [ ] Maintain all existing features (hibernation, multi-user, persistence) -- [ ] Performance parity or better -- [ ] Smooth migration path for existing deployments -- [ ] Comprehensive documentation - ---- - -## Active Tasks - -### Task: Research VNC Migration Strategy -- **Assigned To:** Architect -- **Status:** Not Started -- **Priority:** High -- **Dependencies:** None -- **Notes:** Analyze current VNC implementation, evaluate TigerVNC/noVNC integration -- **Last Updated:** 2024-11-18 - Initial assignment - ---- - -## Communication Protocol - -### For Task Updates -```markdown -### Task: [Task Name] -- **Assigned To:** [Agent Name] -- **Status:** [Not Started | In Progress | Blocked | Review | Complete] -- **Priority:** [Low | Medium | High | Critical] -- **Dependencies:** [List dependencies or "None"] -- **Notes:** [Details, blockers, questions] -- **Last Updated:** [Date] - [Agent Name] -``` - -### For Agent-to-Agent Messages -```markdown -## [From Agent] → [To Agent] - [Date/Time] -[Message content] -``` - -### For Design Decisions -```markdown -## Design Decision: [Topic] -**Date:** [Date] -**Decided By:** Architect -**Decision:** [What was decided] -**Rationale:** [Why this approach] -**Affected Components:** [List components] -``` - ---- - -## StreamSpace Architecture Quick Reference - -### Key Components -1. **API Backend** (Go/Gin) - REST/WebSocket API, NATS event publishing -2. 
**Kubernetes Controller** (Go/Kubebuilder) - Session lifecycle, CRDs -3. **Docker Controller** (Go) - Docker Compose, container management -4. **Web UI** (React) - User dashboard, catalog, admin panel -5. **NATS JetStream** - Event-driven messaging -6. **PostgreSQL** - Database with 82+ tables -7. **VNC Stack** - Current target for Phase 6 migration - -### Critical Files -- `/api/` - Go backend -- `/k8s-controller/` - Kubernetes controller -- `/docker-controller/` - Docker controller -- `/ui/` - React frontend -- `/chart/` - Helm chart -- `/manifests/` - Kubernetes manifests -- `/docs/` - Documentation - -### Development Commands -```bash -# Kubernetes controller -cd k8s-controller && make test - -# Docker controller -cd docker-controller && go test ./... -v - -# API backend -cd api && go test ./... -v - -# UI -cd ui && npm test - -# Integration tests -cd tests && ./run-integration-tests.sh -``` - ---- - -## Best Practices for Agents - -### Architect -- Always consult FEATURES.md and ROADMAP.md before planning -- Document all design decisions in this file -- Consider backward compatibility -- Think about migration paths for existing deployments - -### Builder -- Follow existing Go/React patterns in the codebase -- Check CLAUDE.md for project context -- Write tests alongside implementation -- Update relevant documentation stubs - -### Validator -- Reference existing test patterns in tests/ directory -- Cover edge cases (multi-user, hibernation, resource limits) -- Test both Kubernetes and Docker controller paths -- Validate against security requirements in SECURITY.md - -### Scribe -- Follow documentation style in docs/ directory -- Update CHANGELOG.md for user-facing changes -- Keep API_REFERENCE.md current -- Create practical examples and tutorials - ---- - -## Git Branch Strategy - -- `agent1/planning` - Architecture and design work -- `agent2/implementation` - Core feature development -- `agent3/testing` - Test suites and validation -- `agent4/documentation` - 
Docs and refinement -- `main` - Stable production code -- `develop` - Integration branch for agent work - ---- - -## Coordination Schedule - -**Every 30 minutes:** All agents re-read this file to stay synchronized -**Every task completion:** Update task status and notes -**Every design decision:** Architect documents in this file -**Every feature completion:** Scribe updates relevant documentation - ---- - -## Project Context - -StreamSpace is a production-ready (v1.0.0) platform with: -- ✅ 82+ database tables -- ✅ 70+ API handlers -- ✅ 50+ UI components -- ✅ 15+ middleware layers -- ✅ Enterprise auth (SAML, OIDC, MFA) -- ✅ Compliance & security (DLP, RBAC, audit logging) -- ✅ 40+ Prometheus metrics -- ✅ Plugin system with 200+ templates - -**Next Phase:** VNC Independence - Migration to fully open-source stack - ---- - -## Notes and Blockers - -*This section for cross-agent communication and blocking issues* - ---- - -## Completed Work Log - -*Agents log completed milestones here for project history* diff --git a/.claude/multi-agent/MULTI_AGENT_PLAN.md b/.claude/multi-agent/MULTI_AGENT_PLAN.md index f119fd8b..27bd8b05 100644 --- a/.claude/multi-agent/MULTI_AGENT_PLAN.md +++ b/.claude/multi-agent/MULTI_AGENT_PLAN.md @@ -1,7 +1,7 @@ # StreamSpace Multi-Agent Orchestration Plan **Project:** StreamSpace - Kubernetes-native Container Streaming Platform -**Repository:** https://github.com/JoshuaAFerguson/streamspace +**Repository:** **Current Version:** v1.0.0 (Production Ready) **Next Phase:** v2.0.0 - VNC Independence (TigerVNC + noVNC stack) @@ -10,73 +10,90 @@ ## Agent Roles ### Agent 1: The Architect (Research & Planning) + - **Responsibility:** System exploration, requirements analysis, architecture planning - **Authority:** Final decision maker on design conflicts -- **Focus:** Phase 6 planning, integration strategies, migration paths +- **Focus:** Feature gap analysis, system architecture, review of existing codebase, integration strategies, migration paths ### 
Agent 2: The Builder (Core Implementation) + - **Responsibility:** Feature development, core implementation work - **Authority:** Implementation patterns and code structure - **Focus:** Controller logic, API endpoints, UI components ### Agent 3: The Validator (Testing & Validation) + - **Responsibility:** Test suites, edge cases, quality assurance - **Authority:** Quality gates and test coverage requirements - **Focus:** Integration tests, E2E tests, security validation ### Agent 4: The Scribe (Documentation & Refinement) + - **Responsibility:** Documentation, code refinement, developer guides - **Authority:** Documentation standards and examples - **Focus:** API docs, deployment guides, plugin tutorials --- -## Current Focus: Implementation Gap Analysis & Remediation +## Current Focus: Architecture Redesign - Platform Agnostic Controllers -### Reality Check -**IMPORTANT:** While documentation indicates StreamSpace is "production ready" with extensive features, many features are not yet fully implemented or functional. The documentation represents the vision, not current reality. +### Strategic Shift -### Primary Objective -Conduct comprehensive audit of actual vs documented features, then systematically implement missing functionality. +**Goal**: Transition from a Kubernetes-native architecture to a platform-agnostic "Control Plane + Agent" model. +**Reason**: To support multiple backends (Docker, Hyper-V, vCenter) and simplify the core API. 
### Success Criteria -- [ ] Complete audit of codebase vs documentation -- [ ] Clear list of implemented vs missing features -- [ ] Prioritized implementation roadmap -- [ ] Working core features (sessions, templates, basic auth) -- [ ] Honest documentation reflecting actual state + +- [ ] **Phase 1**: Control Plane Decoupling (Database-backed models, Controller API) +- [ ] **Phase 2**: K8s Agent Adaptation (Refactor k8s-controller to Agent) +- [ ] **Phase 3**: UI Updates (Terminology, Admin Views) --- ## Active Tasks -### Task: Audit Codebase Reality vs Documentation -- **Assigned To:** Architect -- **Status:** Not Started -- **Priority:** CRITICAL -- **Dependencies:** None -- **Notes:** - - Compare FEATURES.md claims against actual code - - Check which API endpoints actually exist - - Verify which database tables are real vs planned - - Test which features actually work - - Document gaps honestly - - Create prioritized implementation plan -- **Last Updated:** 2024-11-18 - Initial assignment - -### Task: Identify Quick Wins -- **Assigned To:** Architect -- **Status:** Not Started -- **Priority:** High -- **Dependencies:** Audit completion -- **Notes:** Find features that are 80% done and can be quickly completed -- **Last Updated:** 2024-11-18 - Initial assignment +### Task: Phase 1 - Control Plane Decoupling + +- **Assigned To**: Builder +- **Status**: Not Started +- **Priority**: CRITICAL +- **Dependencies**: None +- **Notes**: + - Create `Session` and `Template` database tables (replace CRD dependency). + - Implement `Controller` registration API (WebSocket/gRPC). + - Refactor API to use DB instead of K8s client. +- **Last Updated**: 2025-11-20 - Architecture Redesign + +### Task: Phase 2 - K8s Agent Adaptation + +- **Assigned To**: Builder +- **Status**: Not Started +- **Priority**: High +- **Dependencies**: Phase 1 +- **Notes**: + - Fork `k8s-controller` to `controllers/k8s`. + - Implement Agent loop (connect to API, listen for commands). 
+ - Replace CRD status updates with API reporting. +- **Last Updated**: 2025-11-20 - Architecture Redesign + +### Task: Phase 3 - UI Updates + +- **Assigned To**: Builder / Scribe +- **Status**: Not Started +- **Priority**: Medium +- **Dependencies**: Phase 1 +- **Notes**: + - Rename "Pod" to "Instance". + - Update "Nodes" view to "Controllers". + - Ensure status fields map correctly. +- **Last Updated**: 2025-11-20 - Architecture Redesign --- ## Communication Protocol ### For Task Updates + ```markdown ### Task: [Task Name] - **Assigned To:** [Agent Name] @@ -88,12 +105,14 @@ Conduct comprehensive audit of actual vs documented features, then systematicall ``` ### For Agent-to-Agent Messages + ```markdown ## [From Agent] → [To Agent] - [Date/Time] [Message content] ``` ### For Design Decisions + ```markdown ## Design Decision: [Topic] **Date:** [Date] @@ -108,6 +127,7 @@ Conduct comprehensive audit of actual vs documented features, then systematicall ## StreamSpace Architecture Quick Reference ### Key Components + 1. **API Backend** (Go/Gin) - REST/WebSocket API, NATS event publishing 2. **Kubernetes Controller** (Go/Kubebuilder) - Session lifecycle, CRDs 3. **Docker Controller** (Go) - Docker Compose, container management @@ -117,6 +137,7 @@ Conduct comprehensive audit of actual vs documented features, then systematicall 7. 
**VNC Stack** - Current target for Phase 6 migration ### Critical Files + - `/api/` - Go backend - `/k8s-controller/` - Kubernetes controller - `/docker-controller/` - Docker controller @@ -126,6 +147,7 @@ Conduct comprehensive audit of actual vs documented features, then systematicall - `/docs/` - Documentation ### Development Commands + ```bash # Kubernetes controller cd k8s-controller && make test @@ -148,24 +170,28 @@ cd tests && ./run-integration-tests.sh ## Best Practices for Agents ### Architect + - Always consult FEATURES.md and ROADMAP.md before planning - Document all design decisions in this file - Consider backward compatibility - Think about migration paths for existing deployments ### Builder + - Follow existing Go/React patterns in the codebase - Check CLAUDE.md for project context - Write tests alongside implementation - Update relevant documentation stubs ### Validator + - Reference existing test patterns in tests/ directory - Cover edge cases (multi-user, hibernation, resource limits) - Test both Kubernetes and Docker controller paths - Validate against security requirements in SECURITY.md ### Scribe + - Follow documentation style in docs/ directory - Update CHANGELOG.md for user-facing changes - Keep API_REFERENCE.md current @@ -196,6 +222,7 @@ cd tests && ./run-integration-tests.sh ## Audit Methodology for Architect ### Step 1: Repository Structure Analysis + ```bash # Check what actually exists ls -la api/ @@ -213,17 +240,20 @@ find . -name "*.jsx" -o -name "*.tsx" | wc -l For each feature claimed in FEATURES.md: **Check Code:** + - Does the API endpoint exist? - Is there a database migration for it? - Is there controller logic? - Is there UI for it? **Test Functionality:** + - Can you actually use this feature? - Does it work end-to-end? - Are there tests for it? 
**Document Status:** + ```markdown ### Feature: Multi-Factor Authentication (MFA) - **Claimed:** ✅ TOTP authenticator apps with backup codes @@ -246,24 +276,28 @@ For each feature claimed in FEATURES.md: ### Step 4: Prioritize Implementation **P0 - Critical Path (Must Work):** + - Core session lifecycle (create, view, delete) - Basic template system - Simple authentication - Database basics **P1 - Important (Make It Useful):** + - Session persistence - Template catalog - User management - Basic monitoring **P2 - Nice to Have (Enterprise Features):** + - SSO integrations - MFA - Advanced compliance - Plugin system **P3 - Future (Phase 6+):** + - VNC migration - Advanced features - Scaling optimizations @@ -281,6 +315,7 @@ Focus on making core features actually work before adding new ones. StreamSpace is an **ambitious vision** for a Kubernetes-native container streaming platform. The documentation describes a comprehensive feature set, but implementation is ongoing. **What Documentation Claims:** + - ✅ 82+ database tables - ✅ 70+ API handlers - ✅ 50+ UI components @@ -290,16 +325,18 @@ StreamSpace is an **ambitious vision** for a Kubernetes-native container streami - ✅ 200+ templates **Actual State (To Be Verified):** + - ⚠️ Some features fully implemented - ⚠️ Some features partially implemented - ⚠️ Some features not yet implemented - ⚠️ Documentation ahead of implementation **Architecture Vision:** + - **API Backend:** Go/Gin with REST and WebSocket endpoints - **Controllers:** Kubernetes (CRD-based) and Docker (Compose-based) - **Messaging:** NATS JetStream for event-driven coordination -- **Database:** PostgreSQL +- **Database:** PostgreSQL - **UI:** React dashboard with real-time WebSocket updates - **VNC:** Container streaming technology diff --git a/ANALYSIS_REPORT.md b/ANALYSIS_REPORT.md new file mode 100644 index 00000000..75d6c5a9 --- /dev/null +++ b/ANALYSIS_REPORT.md @@ -0,0 +1,80 @@ +# Architecture Redesign Analysis Report + +## Executive Summary + 
+The transition to a platform-agnostic architecture requires significant refactoring of the `api` and `k8s-controller` components. The `ui` is relatively decoupled but still contains Kubernetes-specific terminology and assumptions that need to be abstracted. + +The core challenge is moving from a **Kubernetes-Native** model (where the API talks directly to K8s) to an **Agent-Based** model (where the API talks to generic Controllers). + +## Component Analysis + +### 1. API Backend (`api/`) + +**Current State**: + +- Heavily coupled with Kubernetes via `k8s.io/client-go`. +- `internal/k8s/client.go` handles direct CRD operations. +- `internal/handlers/` assumes Session/Template CRDs exist in a cluster. +- `go.mod` has heavy K8s dependencies. + +**Required Changes**: + +- **Remove K8s Dependencies**: Strip `k8s.io/*` imports. +- **Abstract Data Model**: Replace CRD-based models with database-backed models for `Session` and `Template`. +- **Controller Management**: Implement a registry for Controllers (Agents) to register/connect. +- **Communication Layer**: Implement the secure WebSocket/gRPC server for Controllers to connect to. +- **Scheduler**: Implement a scheduler to decide which Controller should run a session (based on tags/resources). + +### 2. Kubernetes Controller (`k8s-controller/`) + +**Current State**: + +- Standard Kubebuilder controller. +- Watches CRDs and reconciles Pods/PVCs. +- Logic is tightly bound to the "Operator pattern" (watch loop). + +**Required Changes**: + +- **Refactor to Agent**: Change from "watching CRDs" to "listening to Control Plane". +- **Command Execution**: Implement handlers for `StartSession`, `StopSession`, etc., triggered by the Control Plane. +- **State Sync**: Instead of updating CRD status, report status back to the Control Plane via API. +- **Rename**: Move to `controllers/k8s/` and rename to `streamspace-agent-k8s`. + +### 3. Web UI (`ui/`) + +**Current State**: + +- Mostly consumes generic API endpoints. 
+- Some admin pages (`Nodes.tsx`) likely assume K8s nodes. +- Terminology like "Pod Name" is exposed in the UI. + +**Required Changes**: + +- **Terminology Update**: Rename "Pod" to "Instance" or "Container". +- **Admin Views**: Update "Nodes" view to show "Controllers" and their underlying resources. +- **Status Display**: Ensure status fields (Phase, URL) map correctly from the new generic model. + +## Migration Strategy + +1. **Phase 1: Control Plane Decoupling** + - Create the new database schema for Sessions/Templates. + - Update API to read/write to DB instead of K8s. + - Implement the Controller Registration API. + +2. **Phase 2: K8s Agent Adaptation** + - Fork `k8s-controller` to `controllers/k8s`. + - Replace the Manager/Reconciler loop with an Agent loop that connects to the new API. + +3. **Phase 3: UI Updates** + - Update the UI to reflect the new API response structures. + - Remove K8s-specific jargon. + +## Risk Assessment + +- **Complexity**: High. This is a rewrite of the core orchestration logic. +- **Compatibility**: Breaking change. Existing deployments will need a migration path (likely re-creating sessions). +- **Performance**: Moving from K8s watch events to Agent reporting might introduce latency in status updates. + +## Conclusion + +The redesign is feasible but requires a structured approach. The "Control Plane" needs to become the source of truth, rather than Kubernetes. The K8s Controller will become just one of many possible backends. diff --git a/CLAUDE.md b/CLAUDE.md index af469d9c..0b00aed4 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -27,22 +27,25 @@ This document provides comprehensive guidance for AI assistants working with the ## 📖 Project Overview -**StreamSpace** is a Kubernetes-native multi-user platform that streams containerized applications to web browsers using open source VNC technology. It provides on-demand provisioning with auto-hibernation for resource efficiency. 
+**StreamSpace** is a platform-agnostic multi-user platform that streams containerized applications to web browsers. It features a central Control Plane (API/WebUI) that manages distributed Controllers across various platforms (Kubernetes, Docker, Hyper-V, vCenter, etc.). -**Strategic Goal**: Build a 100% open source alternative to commercial container streaming platforms with complete independence from proprietary technologies. +**Strategic Goal**: Build a universal, open-source container streaming platform that runs anywhere, independent of the underlying infrastructure. ### Key Features + +- **Platform Agnostic**: Runs on Kubernetes, Docker, Hyper-V, vCenter, etc. +- **Agent-Based Architecture**: Controllers act as agents on target nodes. - Browser-based access to any containerized application - Multi-user support with SSO (Authentik/Keycloak) -- Persistent home directories (NFS) +- Persistent home directories (NFS/HostPath/Volume) - On-demand auto-hibernation for resource efficiency - 200+ pre-built application templates (LinuxServer.io catalog) - Resource quotas and limits per user - **Plugin system** for extending platform functionality - Comprehensive monitoring with Grafana and Prometheus -- Optimized for k3s and ARM64 architectures ### Project Status + - **Current Phase**: Phase 5 (Production-Ready) - ✅ COMPLETE - **Current Version**: v1.0.0 - **Next Phase**: Phase 6 (VNC Independence) - Migration to TigerVNC + noVNC @@ -50,7 +53,15 @@ This document provides comprehensive guidance for AI assistants working with the - **Branding**: Rebranded from "Workspace Streaming Platform" to "StreamSpace" - **Implementation**: 82+ database tables, 70+ API handlers, 50+ UI components, 15+ middleware layers +### Architecture Changes + +- **Control Plane**: Centralized API and WebUI. +- **Controllers**: Platform-specific agents (e.g., `streamspace-controller-k8s`, `streamspace-controller-docker`). 
+- **Communication**: Controllers connect to the Control Plane via secure API/WebSocket. +- **Resources**: Abstracted `Session` and `Template` models translated by controllers. + ### API Changes from Migration + - **Old API Group**: `workspaces.aiinfra.io/v1alpha1` - **New API Group**: `stream.space/v1alpha1` - **Old Resources**: WorkspaceSession, WorkspaceTemplate @@ -102,6 +113,7 @@ StreamSpace will become the leading open source alternative to commercial contai #### Phase 3: VNC Independence (CRITICAL) **Recommended VNC Stack**: + ``` ┌─────────────────────────────────────┐ │ Web Browser (User) │ @@ -173,6 +185,7 @@ StreamSpace will become the leading open source alternative to commercial contai - CAD/Engineering: FreeCAD, KiCad, OpenSCAD **Image Build Infrastructure**: + ```yaml # GitHub Actions workflow name: Build Container Images @@ -225,6 +238,7 @@ jobs: ### Code Patterns for VNC Abstraction **Good Pattern** (VNC-agnostic): + ```go type VNCConfig struct { Port int `json:"port"` @@ -241,6 +255,7 @@ func (t *Template) GetVNCPort() int { ``` **Bad Pattern** (Kasm-specific): + ```go // ❌ DON'T DO THIS type KasmVNCConfig struct { @@ -249,11 +264,13 @@ type KasmVNCConfig struct { ``` **Good Template Definition**: + ```yaml apiVersion: stream.space/v1alpha1 kind: Template metadata: name: firefox-browser + namespace: streamspace spec: vnc: # Generic VNC config enabled: true @@ -263,6 +280,7 @@ spec: ``` **Bad Template Definition**: + ```yaml # ❌ DON'T DO THIS spec: @@ -276,6 +294,7 @@ spec: Track progress toward full independence: **Phase 3 Tasks**: + - [ ] Research and select VNC stack (TigerVNC + noVNC) - [ ] Build proof-of-concept with open source VNC - [ ] Create base container images with TigerVNC @@ -290,6 +309,7 @@ Track progress toward full independence: - [ ] Security audit of new VNC stack **Completion Criteria**: + - Zero mentions of "Kasm" or "kasmvnc" in codebase - All container images built by StreamSpace - No external dependencies on proprietary 
software @@ -299,11 +319,13 @@ Track progress toward full independence: ### Reference Documentation For detailed migration plan, see: + - `ROADMAP.md` - Complete development roadmap - Phase 3 section for VNC migration details - Phase 6 for production readiness For technical architecture, see: + - `docs/ARCHITECTURE.md` - Current architecture - Future: `docs/VNC_MIGRATION.md` - VNC migration guide @@ -363,12 +385,10 @@ streamspace/ │ ├── PLUGIN_DEVELOPMENT.md # Plugin development guide │ -├── k8s-controller/ # Go Kubernetes controller using Kubebuilder -│ ├── cmd/ # Main entry point -│ ├── internal/ # Controller logic, reconcilers -│ ├── api/ # CRD type definitions -│ ├── config/ # Controller configuration -│ └── tests/ # Controller tests +├── controllers/ # Directory for platform-specific controllers +│ ├── k8s/ # Kubernetes controller (formerly `k8s-controller/`) +│ ├── docker/ # Docker controller (future) +│ └── hyperv/ # Hyper-V controller (future) │ ├── api/ # Go API backend (REST + WebSocket) │ ├── cmd/ # API server entry point @@ -408,11 +428,14 @@ streamspace/ - **`scripts/`**: Automation scripts for template generation and utilities -- **`k8s-controller/`**: Go-based Kubernetes controller (Kubebuilder) - - Manages Session lifecycle and hibernation - - Reconciles CRD resources with Kubernetes state +- **`controllers/`**: Directory for platform-specific controllers + - `k8s/`: Kubernetes controller (formerly `k8s-controller/`) + - `docker/`: Docker controller (future) + - `hyperv/`: Hyper-V controller (future) - **`api/`**: Go API backend (REST + WebSocket) + - Control Plane logic + - Controller management and communication - Authentication and session management - Plugin system backend - WebSocket proxy for VNC connections @@ -426,18 +449,21 @@ streamspace/ ### External Repositories StreamSpace uses separate repositories for templates and plugins to enable: + - Independent versioning and releases - Community contributions without main repo access - Flexible 
deployment (online/offline modes) - Multiple repository sources **Template Repository**: [streamspace-templates](https://github.com/JoshuaAFerguson/streamspace-templates) + - 22+ official application templates - Organized by category (browsers, development, design, etc.) - Auto-synced by API backend (configurable interval) - Catalog metadata for discovery **Plugin Repository**: [streamspace-plugins](https://github.com/JoshuaAFerguson/streamspace-plugins) + - Official and community plugins - Extension points for platform functionality - Auto-discovery via catalog @@ -448,6 +474,7 @@ StreamSpace uses separate repositories for templates and plugins to enable: ## 🛠 Key Technologies ### Core Stack + - **Kubernetes**: 1.19+ (k3s recommended for ARM64) - **Container Runtime**: Docker/containerd - **Storage**: NFS with ReadWriteMany support @@ -456,6 +483,7 @@ StreamSpace uses separate repositories for templates and plugins to enable: - **Database**: PostgreSQL (for user data, sessions, audit logs) ### Controller (✅ Implemented) + - **Language**: Go 1.21+ - **Framework**: Kubebuilder 3.x - **Client**: controller-runtime @@ -463,6 +491,7 @@ StreamSpace uses separate repositories for templates and plugins to enable: - **Status**: Production-ready with hibernation, session lifecycle, and user PVC management ### API Backend (✅ Implemented) + - **Framework**: Go with Gin framework - **Authentication**: Local, SAML 2.0, OIDC OAuth2, JWT, MFA (TOTP) - **WebSocket**: Real-time session updates and VNC proxy @@ -472,6 +501,7 @@ StreamSpace uses separate repositories for templates and plugins to enable: - **Integrations**: Webhooks (16 events), Slack, Teams, Discord, PagerDuty, email (SMTP) ### Web UI (✅ Implemented) + - **Framework**: React 18+ with TypeScript - **UI Library**: Material-UI (MUI) - **State Management**: React Context API @@ -482,12 +512,14 @@ StreamSpace uses separate repositories for templates and plugins to enable: - **Features**: Session management, plugin 
catalog, admin panel, real-time updates ### Application Streaming + - **VNC Server**: Currently KasmVNC (⚠️ TEMPORARY - will be replaced with TigerVNC + noVNC in Phase 3) - **Base Images**: Currently LinuxServer.io containers (⚠️ TEMPORARY - will be replaced with StreamSpace-native images in Phase 3) - **VNC Port**: 5900 (standard VNC) or 3000 (current LinuxServer.io convention) - **Target Stack**: TigerVNC server + noVNC client + WebSocket proxy (100% open source) ### Monitoring + - **Metrics**: Prometheus - **Dashboards**: Grafana - **Alerts**: PrometheusRule CRDs @@ -506,6 +538,7 @@ StreamSpace uses separate repositories for templates and plugins to enable: **Short Names**: `ss`, `sessions` **Key Fields**: + ```yaml apiVersion: stream.space/v1alpha1 kind: Session @@ -534,6 +567,7 @@ status: ``` **kubectl Examples**: + ```bash # List all sessions kubectl get sessions -n streamspace @@ -558,6 +592,7 @@ kubectl delete session user1-firefox -n streamspace **Short Names**: `tpl`, `templates` **Key Fields**: + ```yaml apiVersion: stream.space/v1alpha1 kind: Template @@ -599,6 +634,7 @@ spec: ``` **kubectl Examples**: + ```bash # List all templates kubectl get templates -n streamspace @@ -627,6 +663,7 @@ These exist for migration compatibility but should not be used in new code. **Goal**: Build the Go-based Kubernetes controller using Kubebuilder. **Prerequisites**: + - Go 1.21+ - Kubebuilder 3.x - Docker @@ -636,6 +673,7 @@ These exist for migration compatibility but should not be used in new code. **Implementation Steps**: 1. **Initialize Kubebuilder Project**: + ```bash mkdir -p controller cd controller @@ -652,21 +690,25 @@ kubebuilder create api --group stream --version v1alpha1 --kind Template ``` 2. **Define CRD Types**: + - Edit `api/v1alpha1/session_types.go` - Edit `api/v1alpha1/template_types.go` - Reference: `docs/CONTROLLER_GUIDE.md` for detailed examples 3. 
**Implement Reconcilers**: + - `controllers/session_controller.go`: Main reconciliation logic - `controllers/hibernation_controller.go`: Auto-hibernation logic - `controllers/user_controller.go`: User PVC management 4. **Add Prometheus Metrics**: + - Active sessions gauge - Hibernation events counter - Resource usage metrics 5. **Build and Test**: + ```bash # Generate CRDs and code make manifests generate @@ -685,6 +727,7 @@ make docker-build IMG=your-registry/streamspace-controller:v0.1.0 ``` 6. **Deploy to Cluster**: + ```bash # Push image make docker-push IMG=your-registry/streamspace-controller:v0.1.0 @@ -696,12 +739,14 @@ make deploy IMG=your-registry/streamspace-controller:v0.1.0 ### Phase 2: API & UI Implementation (Future) **API Backend** (Go with Gin or Python with FastAPI): + - REST endpoints for session management - WebSocket proxy for KasmVNC connections - JWT authentication with OIDC - Kubernetes client for CRD operations **Web UI** (React + TypeScript): + - User dashboard (my sessions, catalog) - Admin panel (all sessions, users, templates) - Session viewer (iframe or new tab) @@ -723,6 +768,7 @@ make deploy IMG=your-registry/streamspace-controller:v0.1.0 **Main Branch**: `main` (protected) **Feature Branches**: + - Format: `claude/claude-md-` - Example: `claude/claude-md-mhy5zeq2njvrp3yh-01MfcP2sWxBRw6sTTyEGW5gg` - Always develop on feature branches, not main @@ -740,6 +786,7 @@ Follow conventional commit format: ``` **Types**: + - `feat`: New feature - `fix`: Bug fix - `docs`: Documentation changes @@ -749,6 +796,7 @@ Follow conventional commit format: - `ci`: CI/CD changes **Examples**: + ```bash feat(controller): implement session hibernation reconciler fix(crd): correct validation for resource limits @@ -765,6 +813,7 @@ test(controller): add session lifecycle integration tests 4. 
**Reference Issues**: Include issue numbers when applicable **Good Examples**: + ```bash git commit -m "Add hibernation controller for auto-scaling sessions @@ -775,6 +824,7 @@ Closes #42" ``` **Bad Examples** (avoid): + ```bash git commit -m "updates" git commit -m "fixed stuff" @@ -784,6 +834,7 @@ git commit -m "WIP" ### Git Operations **Pushing Changes**: + ```bash # Always push to feature branch with -u flag git push -u origin claude/claude-md- @@ -793,10 +844,12 @@ git push -u origin claude/claude-md- ``` **Network Retry Strategy**: + - If `git push` or `git fetch` fails due to network errors - Retry up to 4 times with exponential backoff (2s, 4s, 8s, 16s) **Pull Requests**: + - Create PRs from feature branch to main - Use PR template (see `CONTRIBUTING.md`) - Request review from maintainers @@ -809,17 +862,20 @@ git push -u origin claude/claude-md- ### Unit Tests **Controller Tests**: + ```bash cd controller make test ``` **Test Structure**: + - Place tests in `*_test.go` files next to source - Use `ginkgo` and `gomega` for BDD-style tests - Mock Kubernetes client with `envtest` **Example Test**: + ```go var _ = Describe("Session Controller", func() { Context("When creating a new Session", func() { @@ -835,11 +891,13 @@ var _ = Describe("Session Controller", func() { **Location**: `tests/` directory (to be created) **Run Integration Tests**: + ```bash ./scripts/run-integration-tests.sh ``` **Test Scenarios**: + - Session creation and lifecycle - Hibernation and wake flows - Resource quota enforcement @@ -848,6 +906,7 @@ var _ = Describe("Session Controller", func() { ### Manual Testing **Deploy to Test Cluster**: + ```bash # Create test namespace kubectl create namespace streamspace-dev @@ -919,6 +978,7 @@ kubectl get templates -n streamspace ### Deploy Platform (Full Installation) **Option 1: Manual Deployment**: + ```bash # 1. 
Create namespace kubectl apply -f manifests/config/namespace.yaml @@ -945,6 +1005,7 @@ kubectl apply -f manifests/monitoring/ ``` **Option 2: Helm Deployment** (Recommended): + ```bash # Install from local chart helm install streamspace ./chart -n streamspace --create-namespace @@ -963,10 +1024,12 @@ helm uninstall streamspace -n streamspace ### Configuration **Key Configuration Files**: + - `chart/values.yaml`: Helm chart defaults - `manifests/config/controller-configmap.yaml`: Controller settings **Important Settings**: + ```yaml # Hibernation hibernation: @@ -1001,6 +1064,7 @@ networking: **Style Guide**: Follow [Effective Go](https://golang.org/doc/effective_go.html) **Formatting**: + ```bash # Format code gofmt -w . @@ -1010,12 +1074,14 @@ golangci-lint run ``` **Naming Conventions**: + - Types: PascalCase (`SessionReconciler`, `UserManager`) - Functions: camelCase (`reconcileSession`, `ensureUserPVC`) - Constants: UPPER_SNAKE_CASE or PascalCase for exported - Packages: lowercase, single word (`controllers`, `metrics`) **Error Handling**: + ```go // Always handle errors explicitly if err := r.Create(ctx, deployment); err != nil { @@ -1028,6 +1094,7 @@ return fmt.Errorf("failed to get template %s: %w", templateName, err) ``` **Comments**: + ```go // SessionReconciler reconciles a Session object and manages // the lifecycle of workspace pods, services, and PVCs. 
@@ -1047,11 +1114,13 @@ func (r *SessionReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ct ### YAML (Kubernetes Manifests) **Formatting**: + - Indent: 2 spaces - Use `---` separator between resources in same file - Order fields: apiVersion, kind, metadata, spec, status **Labels**: + ```yaml metadata: labels: @@ -1065,6 +1134,7 @@ metadata: ``` **Annotations**: + ```yaml metadata: annotations: @@ -1074,6 +1144,7 @@ metadata: ``` **Resource Naming**: + - Sessions: `{username}-{template}` (e.g., `user1-firefox`) - Pods: `ss-{username}-{template}-{hash}` (e.g., `ss-user1-firefox-abc123`) - Services: `ss-{username}-{template}-svc` @@ -1082,11 +1153,13 @@ metadata: ### Documentation **Code Comments**: + - Public APIs must have godoc comments - Complex logic should have inline comments explaining "why" - Use TODO/FIXME/NOTE markers with issue references **Markdown Files**: + - Use ATX-style headers (`#` not `===`) - Include table of contents for long documents - Use code blocks with language tags @@ -1099,12 +1172,14 @@ metadata: ### Working with CRDs **Install CRDs**: + ```bash kubectl apply -f manifests/crds/session.yaml kubectl apply -f manifests/crds/template.yaml ``` **Update CRDs** (after modifying in controller): + ```bash cd controller make manifests # Generate updated CRDs @@ -1112,6 +1187,7 @@ kubectl apply -f config/crd/bases/ ``` **View CRD Definition**: + ```bash kubectl get crd sessions.stream.space -o yaml kubectl explain session.spec @@ -1121,6 +1197,7 @@ kubectl explain session.status ### Working with Sessions **Create a Session**: + ```bash kubectl apply -f - <&1 | tee controller.log ``` **Debug Controller**: + ```bash # Enable debug logging export LOG_LEVEL=debug @@ -1225,6 +1312,7 @@ dlv debug ./cmd/main.go ### Monitoring **View Prometheus Metrics**: + ```bash # Port forward to controller kubectl port-forward -n streamspace deploy/streamspace-controller 8080:8080 @@ -1234,6 +1322,7 @@ curl http://localhost:8080/metrics | grep streamspace 
``` **Access Grafana**: + ```bash kubectl port-forward -n observability svc/grafana 3000:80 @@ -1242,6 +1331,7 @@ kubectl port-forward -n observability svc/grafana 3000:80 ``` **View Alerts**: + ```bash kubectl get prometheusrules -n streamspace kubectl describe prometheusrule streamspace-alerts -n streamspace @@ -1262,6 +1352,7 @@ kubectl describe prometheusrule streamspace-alerts -n streamspace ### Current State **What Exists**: + - ✅ Complete architecture documentation (`docs/ARCHITECTURE.md`) - ✅ Controller implementation guide (`docs/CONTROLLER_GUIDE.md`) - ✅ Plugin development guide (`PLUGIN_DEVELOPMENT.md`) @@ -1275,6 +1366,7 @@ kubectl describe prometheusrule streamspace-alerts -n streamspace - ✅ Comprehensive README and CONTRIBUTING guides **Implementation Status**: + - ✅ Go controller using Kubebuilder (Phase 1 - Complete) - ✅ API backend with REST/WebSocket (Phase 2 - Complete) - ✅ React web UI with admin panel (Phase 4 - Complete) @@ -1287,6 +1379,7 @@ kubectl describe prometheusrule streamspace-alerts -n streamspace - ✅ Helm chart for deployment (Phase 5 - Complete) **What's Complete** (Phases 1-5): + - ✅ **Controller**: Session lifecycle, hibernation, user PVC management - ✅ **API Backend**: 70+ handlers, authentication (Local/SAML/OIDC), webhooks, integrations - ✅ **Web UI**: 50+ components, 14 user pages, 12 admin pages @@ -1305,6 +1398,7 @@ kubectl describe prometheusrule streamspace-alerts -n streamspace - ✅ **Documentation**: Complete user/admin/developer guides **What Remains** (Future Enhancements - Phase 6+): + - ⏳ VNC migration from LinuxServer.io to StreamSpace-native images (TigerVNC + noVNC) - ⏳ Multi-cluster federation - ⏳ WebRTC-based streaming (lower latency alternative) @@ -1346,12 +1440,14 @@ kubectl describe prometheusrule streamspace-alerts -n streamspace ### Common Misconceptions to Avoid **⚠️ Critical - Independence Strategy**: + - ❌ **Don't** introduce new KasmVNC references - use generic VNC - ❌ **Don't** hardcode Kasm-specific 
features - keep VNC-agnostic - ❌ **Don't** use `kasmvnc:` field name - use `vnc:` instead - ❌ **Don't** assume KasmVNC will remain - code for TigerVNC migration **Architecture Patterns**: + - ❌ **Don't** use StatefulSets - use Deployments with replicas field - ❌ **Don't** delete pods for hibernation - scale Deployment to 0 - ❌ **Don't** create per-session PVCs - use shared user PVC @@ -1392,6 +1488,7 @@ When helping with specific tasks, reference these files: ### CRD Issues **Problem**: CRD not found + ```bash # Solution: Install CRDs kubectl apply -f manifests/crds/session.yaml @@ -1402,6 +1499,7 @@ kubectl get crds | grep stream.space ``` **Problem**: CRD validation errors + ```bash # Solution: Check CRD schema kubectl explain session.spec @@ -1414,6 +1512,7 @@ kubectl apply -f manifests/crds/session.yaml ### Session Issues **Problem**: Session stuck in Pending phase + ```bash # Check session status kubectl describe session -n streamspace @@ -1429,6 +1528,7 @@ kubectl get events -n streamspace --sort-by=.metadata.creationTimestamp ``` **Problem**: Session pod not starting + ```bash # Check pod details kubectl describe pod -n streamspace @@ -1443,6 +1543,7 @@ kubectl logs -n streamspace ``` **Problem**: Hibernation not working + ```bash # Verify hibernation is enabled kubectl get cm -n streamspace streamspace-config -o yaml | grep hibernation @@ -1457,6 +1558,7 @@ kubectl logs -n streamspace deploy/streamspace-controller -f | grep -i hibernati ### Template Issues **Problem**: Template not found + ```bash # List available templates kubectl get templates -n streamspace @@ -1469,6 +1571,7 @@ kubectl get template firefox-browser -n streamspace ``` **Problem**: Template image pull failures + ```bash # Test image manually docker pull lscr.io/linuxserver/firefox:latest @@ -1484,6 +1587,7 @@ kubectl edit template firefox-browser -n streamspace ### Controller Issues **Problem**: Controller not starting + ```bash # Check controller deployment kubectl get deploy -n 
streamspace streamspace-controller @@ -1498,6 +1602,7 @@ kubectl logs -n streamspace deploy/streamspace-controller ``` **Problem**: Controller errors in logs + ```bash # Enable debug logging kubectl set env -n streamspace deploy/streamspace-controller LOG_LEVEL=debug @@ -1514,6 +1619,7 @@ kubectl logs -n streamspace deploy/streamspace-controller -f ### Storage Issues **Problem**: PVC stuck in Pending + ```bash # Check PVC status kubectl describe pvc home- -n streamspace @@ -1533,6 +1639,7 @@ kubectl get pods -n kube-system | grep nfs ### Network Issues **Problem**: Cannot access session URL + ```bash # Check ingress kubectl get ingress -n streamspace @@ -1551,6 +1658,7 @@ kubectl port-forward -n streamspace svc/ 3000:3000 ### Build Issues **Problem**: `make` commands fail in controller + ```bash # Install Kubebuilder curl -L -o kubebuilder https://go.kubebuilder.io/dl/latest/$(go env GOOS)/$(go env GOARCH) @@ -1568,6 +1676,7 @@ make manifests generate ``` **Problem**: Docker build fails + ```bash # Check Dockerfile exists ls -la Dockerfile @@ -1587,6 +1696,7 @@ docker system prune -a ## 📚 Additional Resources ### External Documentation + - [Kubernetes Documentation](https://kubernetes.io/docs/) - [Kubebuilder Book](https://book.kubebuilder.io/) - [LinuxServer.io Documentation](https://docs.linuxserver.io/) @@ -1594,6 +1704,7 @@ docker system prune -a - [Traefik Documentation](https://doc.traefik.io/traefik/) ### Internal Documentation + - `README.md`: User-facing project overview - `CONTRIBUTING.md`: Contribution guidelines and coding standards - `MIGRATION_SUMMARY.md`: Migration history and context @@ -1602,10 +1713,11 @@ docker system prune -a - `chart/README.md`: Helm installation instructions ### Community & Support + - **GitHub Issues**: Bug reports and feature requests - **GitHub Discussions**: Questions and community support - **Discord**: Real-time chat (link in README) -- **Documentation Site**: https://docs.streamspace.io (future) +- **Documentation 
Site**: (future) --- diff --git a/README.md b/README.md index 2d16b28a..48923e4b 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,8 @@ # StreamSpace -> **Stream any app to your browser** - An open source Kubernetes-native container streaming platform +> **Stream any app to your browser** - An open source platform-agnostic container streaming platform -StreamSpace is a Kubernetes-native platform that delivers browser-based access to containerized applications with on-demand auto-hibernation, persistent user storage, and enterprise-grade security features. +StreamSpace is a platform-agnostic platform that delivers browser-based access to containerized applications. It features a central Control Plane (API/WebUI) that manages distributed Controllers across various platforms (Kubernetes, Docker, Hyper-V, vCenter, etc.). [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Kubernetes](https://img.shields.io/badge/kubernetes-1.19+-blue.svg)](https://kubernetes.io/) @@ -37,6 +37,7 @@ StreamSpace is in active development with the core Kubernetes platform functiona ## Features ### Core Features + - Browser-based access to containerized applications via VNC - Multi-user support with isolated sessions - Persistent home directories (NFS) @@ -46,6 +47,7 @@ StreamSpace is in active development with the core Kubernetes platform functiona - Monitoring with Prometheus and Grafana ### Enterprise Features + - Authentication: Local, SAML 2.0 (Okta, Azure AD, Authentik, Keycloak, Auth0), OIDC OAuth2 - Multi-factor authentication with TOTP - IP whitelisting and rate limiting @@ -113,20 +115,20 @@ kubectl create secret generic streamspace-secrets \ │ REST API + WebSocket ↓ ┌─────────────────────────────────────────────────┐ -│ API Backend (Go/Gin) │ -│ Session CRUD, Auth, Plugins, Repository Sync │ +│ Control Plane (API) │ +│ Session CRUD, Auth, Plugins, Controller Mgmt │ └──────────────────────┬──────────────────────────┘ - │ 
Kubernetes API + │ Secure Protocol ↓ ┌─────────────────────────────────────────────────┐ -│ Kubernetes Controller (Go) │ -│ Session Lifecycle, Auto-Hibernation │ +│ StreamSpace Controllers │ +│ (Kubernetes, Docker, Hyper-V, etc.) │ └──────────────────────┬──────────────────────────┘ │ ↓ ┌─────────────────────────────────────────────────┐ -│ Kubernetes Cluster │ -│ Sessions (Pods), PVCs, Services, Ingress │ +│ Target Infrastructure │ +│ Sessions (Pods/Containers/VMs) │ └─────────────────────────────────────────────────┘ ``` @@ -176,17 +178,20 @@ Current test coverage is approximately 15-20%. See `tests/reports/TEST_COVERAGE_ ## Documentation ### Essential Docs + - [FEATURES.md](FEATURES.md) - Feature list with implementation status - [ROADMAP.md](ROADMAP.md) - Development roadmap and next steps - [CLAUDE.md](CLAUDE.md) - AI assistant guide for the codebase ### Technical Guides + - [Architecture](docs/ARCHITECTURE.md) - System architecture - [Controller Guide](docs/CONTROLLER_GUIDE.md) - Controller implementation - [Plugin Development](PLUGIN_DEVELOPMENT.md) - Building plugins - [API Reference](api/API_REFERENCE.md) - REST API documentation ### Deployment + - [Deployment Guide](DEPLOYMENT.md) - Production deployment - [Security](SECURITY.md) - Security policy @@ -237,9 +242,9 @@ StreamSpace is licensed under the MIT License. 
See [LICENSE](LICENSE) for detail ## Links -- **GitHub**: https://github.com/JoshuaAFerguson/streamspace -- **Templates**: https://github.com/JoshuaAFerguson/streamspace-templates -- **Plugins**: https://github.com/JoshuaAFerguson/streamspace-plugins +- **GitHub**: +- **Templates**: +- **Plugins**: --- diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index bd950280..a9e62393 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -20,93 +20,59 @@ StreamSpace is a Kubernetes-native multi-user platform that streams containerize │ HTTPS ↓ ┌──────────────────────────────────────────────────────────────┐ -│ Ingress (Traefik) │ -│ - TLS termination │ -│ - ForwardAuth (Authentik SSO) │ -│ - Dynamic routing per session │ +│ Ingress / Load Balancer │ └────────────────────────┬─────────────────────────────────────┘ │ ┌──────────────┴─────────────┐ ↓ ↓ ┌─────────────────────┐ ┌──────────────────────┐ -│ Web UI (React) │ │ API Backend (Go) │ +│ Web UI (React) │ │ Control Plane (API)│ │ - Dashboard │ │ - REST API │ │ - Catalog │ │ - WebSocket │ │ - Session viewer │ │ - Auth middleware │ -│ - Admin panel │ │ - K8s client │ +│ - Admin panel │ │ - Controller Mgmt │ └─────────────────────┘ └──────────┬───────────┘ - │ + │ Secure Protocol (gRPC/WS) ┌──────────────┴──────────────┐ ↓ ↓ -┌──────────────────────────────────────┐ ┌─────────────────┐ -│ StreamSpace Controller (Go) │ │ PostgreSQL │ -│ ┌────────────────────────────────┐ │ │ - Sessions │ -│ │ Session Reconciler │ │ │ - Users │ -│ │ - Create/Update/Delete pods │ │ │ - Templates │ -│ │ - Status tracking │ │ │ - Audit logs │ -│ └────────────────────────────────┘ │ └─────────────────┘ -│ ┌────────────────────────────────┐ │ -│ │ Hibernation Controller │ │ -│ │ - Idle detection │ │ -│ │ - Scale to zero │ │ -│ │ - Wake on access │ │ -│ └────────────────────────────────┘ │ -│ ┌────────────────────────────────┐ │ -│ │ User Manager │ │ -│ │ - PVC provisioning │ │ -│ │ - Quota enforcement │ │ -│ 
└────────────────────────────────┘ │ -└────────────────┬─────────────────────┘ - │ Kubernetes API - ↓ -┌──────────────────────────────────────────────────────────────┐ -│ Kubernetes Cluster │ -│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ -│ │ Session Pod │ │ Session Pod │ │ Session Pod │ │ -│ │ ┌──────────┐ │ │ ┌──────────┐ │ │ ┌──────────┐ │ │ -│ │ │Container │ │ │ │Container │ │ │ │Container │ │ │ -│ │ │(Firefox) │ │ │ │(VS Code) │ │ │ │(Blender) │ │ │ -│ │ │+ KasmVNC │ │ │ │+ KasmVNC │ │ │ │+ KasmVNC │ │ │ -│ │ └──────────┘ │ │ └──────────┘ │ │ └──────────┘ │ │ -│ │ ↓ │ │ ↓ │ │ ↓ │ │ -│ │ /home/user1 │ │ /home/user2 │ │ /home/user1 │ │ -│ │ (NFS PVC) │ │ (NFS PVC) │ │ (NFS PVC) │ │ -│ └──────────────┘ └──────────────┘ └──────────────┘ │ -└──────────────────────────────────────────────────────────────┘ - ↑ - │ NFS Protocol - ↓ -┌──────────────────────────────────────────────────────────────┐ -│ NFS Server (Persistent User Homes) │ -│ /export/home/user1, /export/home/user2, /export/home/user3 │ -└──────────────────────────────────────────────────────────────┘ +┌──────────────────────────────────────┐ ┌──────────────────────────────────────┐ +│ Kubernetes Controller (Agent) │ │ Docker Controller (Agent) │ +│ - Runs on K8s Cluster │ │ - Runs on Docker Host │ +│ - Manages Pods/PVCs │ │ - Manages Containers/Volumes │ +│ - Reports Status │ │ - Reports Status │ +└────────────────┬─────────────────────┘ └────────────────┬─────────────────────┘ + │ │ + ↓ ↓ +┌──────────────────────────────────────┐ ┌──────────────────────────────────────┐ +│ Kubernetes Cluster │ │ Docker Host │ +│ [Session Pods] │ │ [Session Containers] │ +└──────────────────────────────────────┘ └──────────────────────────────────────┘ ``` ## Core Components -### 1. StreamSpace Controller +### 1. 
StreamSpace Controllers (Agents) -**Language**: Go with Kubebuilder framework -**Purpose**: Manages session lifecycle and resource provisioning +**Architecture**: Agent-based model similar to Portainer Agents. +**Purpose**: Platform-specific implementation of session management. **Responsibilities**: -- Watch for Session CRD changes -- Provision pods, services, PVCs based on templates -- Update session status (phase, URL, resource usage) -- Enforce user quotas -- Handle state transitions (running → hibernated → terminated) - -**Key Reconcilers**: -- `SessionReconciler`: Main reconciliation loop -- `HibernationReconciler`: Idle detection and scale-to-zero -- `UserReconciler`: User management and PVC provisioning - -**Metrics Exposed**: -- `streamspace_active_sessions_total` -- `streamspace_hibernated_sessions_total` -- `streamspace_session_starts_total` -- `streamspace_hibernation_duration_seconds` -- `streamspace_resource_usage_bytes` + +- **Control**: Execute commands from Control Plane (Start, Stop, Hibernate). +- **Monitor**: Collect metrics (CPU, Memory, Network) and report to Control Plane. +- **Log**: Stream logs back to Control Plane. +- **Report**: Periodic status updates (Heartbeat, Session State). + +**Controller Types**: + +- **Kubernetes Controller**: Manages Pods, PVCs, Services. +- **Docker Controller**: Manages Containers, Volumes, Networks. +- **Hyper-V/vCenter**: Manages VMs (Future). + +**Communication**: + +- Secure WebSocket or gRPC connection to Control Plane. +- Pull-based or Push-based command execution. ### 2. 
API Backend @@ -114,6 +80,7 @@ StreamSpace is a Kubernetes-native multi-user platform that streams containerize **Purpose**: REST/WebSocket API for UI and integrations **Endpoints**: + - `GET /api/v1/sessions` - List user sessions - `POST /api/v1/sessions` - Create session - `GET /api/v1/sessions/{id}` - Get session details @@ -124,11 +91,13 @@ StreamSpace is a Kubernetes-native multi-user platform that streams containerize - `WS /api/v1/sessions/{id}/connect` - WebSocket for KasmVNC proxy **Authentication**: + - OIDC via Authentik - JWT tokens (1-hour expiration) - Refresh token flow **Authorization**: + - Users: Own sessions only - Admins: All sessions + config @@ -138,6 +107,7 @@ StreamSpace is a Kubernetes-native multi-user platform that streams containerize **Purpose**: User-facing dashboard and admin panel **Pages**: + - `/login` - Authentik SSO login - `/dashboard` - My sessions (running, hibernated) - `/catalog` - Browse templates by category @@ -155,6 +125,7 @@ StreamSpace is a Kubernetes-native multi-user platform that streams containerize **Structure**: Single-container pod with user-specific labels **Pod Specification**: + ```yaml apiVersion: v1 kind: Pod @@ -194,6 +165,7 @@ spec: ``` **Networking**: + - Service per session: `ss-user1-firefox-svc` - Ingress rule: `user1-firefox.streamspace.local` → Service - KasmVNC port: 3000 (default) @@ -203,6 +175,7 @@ spec: **Backend**: NFS with ReadWriteMany support **PVC per User**: + ```yaml apiVersion: v1 kind: PersistentVolumeClaim @@ -220,6 +193,7 @@ spec: **Mount Path**: `/config` (LinuxServer.io convention) or `/home/kasm-user` **Benefits**: + - Files persist across sessions - Shared across all user's workspaces - Backed up independently @@ -229,6 +203,7 @@ spec: **Purpose**: Extensible architecture for adding custom functionality without modifying core code **Plugin Types**: + - **Extension**: Add new features and UI components - **Webhook**: React to system events (session created, user login, etc.) 
- **API Integration**: Connect to external services (Slack, GitHub, Jira) @@ -236,6 +211,7 @@ spec: - **CLI**: Add custom command-line tools **Database Schema**: + ```sql -- Plugin repositories (GitHub, GitLab, custom) CREATE TABLE repositories ( @@ -276,6 +252,7 @@ CREATE TABLE installed_plugins ( ``` **API Endpoints**: + - `GET /api/v1/plugins/catalog` - Browse available plugins - `POST /api/v1/plugins/install` - Install plugin - `GET /api/v1/plugins/installed` - List installed plugins @@ -285,6 +262,7 @@ CREATE TABLE installed_plugins ( - `DELETE /api/v1/plugins/{id}` - Uninstall plugin **UI Components**: + - **PluginCatalog** (`/plugins/catalog`) - Browse and install plugins with search, filters, ratings - **InstalledPlugins** (`/plugins/installed`) - Manage installed plugins with config editor - **Admin PluginManagement** (`/admin/plugins`) - System-wide plugin administration @@ -293,6 +271,7 @@ CREATE TABLE installed_plugins ( - **PluginConfigForm** - Schema-based form generator for plugin configuration **Security Features**: + - Permission system with risk levels (low/medium/high) - Sandbox execution environment - Configuration validation @@ -300,6 +279,7 @@ CREATE TABLE installed_plugins ( - User/admin approval workflows **Event System**: + ```javascript // Plugins can register handlers for these events: - session.created @@ -316,6 +296,7 @@ CREATE TABLE installed_plugins ( ``` **Documentation**: + - `PLUGIN_DEVELOPMENT.md` - Complete developer guide with examples - `docs/PLUGIN_API.md` - Comprehensive API reference @@ -324,6 +305,7 @@ CREATE TABLE installed_plugins ( ### Session Creation Flow 1. **User clicks "Launch" in UI** + ``` POST /api/v1/sessions { @@ -338,6 +320,7 @@ CREATE TABLE installed_plugins ( - Generate unique session name 3. **API creates Session CR** + ```yaml apiVersion: stream.space/v1alpha1 kind: Session @@ -367,6 +350,7 @@ CREATE TABLE installed_plugins ( - User home directory mounted 7. 
**Status update** + ```yaml status: phase: Running @@ -386,6 +370,7 @@ CREATE TABLE installed_plugins ( - `time.Now() - lastActivity > idleTimeout` (default 30m) 3. **Updates Session state** + ```yaml spec: state: hibernated @@ -399,11 +384,13 @@ CREATE TABLE installed_plugins ( 5. **User returns and clicks session** 6. **API wake endpoint** + ``` POST /api/v1/sessions/{id}/wake ``` 7. **Updates Session state** + ```yaml spec: state: running @@ -484,6 +471,7 @@ spec: ### Authentication **SSO via Authentik**: + - OIDC provider - JWT tokens (access + refresh) - MFA support @@ -492,11 +480,13 @@ spec: ### Authorization **RBAC**: + - Users can only access their own sessions - Admins can access all sessions - Service accounts for automation **Network Policies**: + ```yaml apiVersion: networking.k8s.io/v1 kind: NetworkPolicy @@ -535,11 +525,13 @@ spec: ### Memory Allocation **Cluster**: 64GB total (4 × 16GB nodes) + - System overhead: 8GB - StreamSpace platform: 4GB - **Available for sessions**: 52GB **Per-Session Estimates**: + - Browsers: 2GB - IDEs: 4GB - 3D/Video: 6-8GB @@ -573,6 +565,7 @@ func (r *SessionReconciler) enforceQuota(user string) error { ### Metrics **Controller Metrics**: + - Active sessions count - Hibernated sessions count - Session start/end events @@ -581,6 +574,7 @@ func (r *SessionReconciler) enforceQuota(user string) error { - Cluster capacity % **API Metrics**: + - Request rate - Error rate - Response time (p50, p95, p99) @@ -589,6 +583,7 @@ func (r *SessionReconciler) enforceQuota(user string) error { ### Dashboards **Grafana "Session Overview"**: + - Active vs hibernated sessions - Memory usage (per session, total) - Session lifecycle events @@ -626,14 +621,17 @@ chart/ ### High Availability (Phase 5) **Controller HA**: + - 2+ replicas with leader election - Kubernetes lease for coordination **API HA**: + - 3+ replicas behind Service - Horizontal Pod Autoscaler **Database HA**: + - PostgreSQL with replication - Or cloud-managed (RDS, Cloud 
SQL)
@@ -642,6 +640,7 @@ chart/
 ### Session Provisioning
 
 **Target**: < 30 seconds from request to accessible
+
 - Pod scheduling: 5-10s
 - Image pull (cached): 2-5s
 - Container start: 10-15s
@@ -671,6 +670,7 @@ chart/
 ---
 
 For implementation details, see:
+
 - Controller: `docs/CONTROLLER_GUIDE.md`
 - API: `docs/API_REFERENCE.md`
 - Deployment: `docs/GETTING_STARTED.md`
diff --git a/docs/CONTROLLER_SPEC.md b/docs/CONTROLLER_SPEC.md
new file mode 100644
index 00000000..324dafbe
--- /dev/null
+++ b/docs/CONTROLLER_SPEC.md
@@ -0,0 +1,101 @@
+# StreamSpace Controller Specification
+
+## Overview
+
+The StreamSpace Controller is a platform-specific agent that manages the lifecycle of Sessions and Templates on a target infrastructure. It acts as a bridge between the central Control Plane (API) and the underlying platform (Kubernetes, Docker, Hyper-V, etc.).
+
+## Architecture
+
+### Agent Model
+
+Controllers operate as **Agents**. They are installed on the target infrastructure and initiate an outbound connection to the Control Plane. This avoids the need for the Control Plane to have direct inbound access to the controllers, simplifying network configuration (firewalls, NAT).
+
+### Communication Protocol
+
+- **Transport**: Secure WebSocket (WSS) or gRPC over TLS.
+- **Direction**: Outbound from Controller to Control Plane.
+- **Authentication**: API Key or Mutual TLS (mTLS).
+
+## Responsibilities
+
+### 1. Control (Command Execution)
+
+The Controller must execute commands received from the Control Plane:
+
+- `StartSession(SessionSpec)`: Provision a new session.
+- `StopSession(SessionID)`: Terminate a session.
+- `HibernateSession(SessionID)`: Pause a session (release resources, keep state).
+- `WakeSession(SessionID)`: Resume a hibernated session.
+
+### 2. Monitor (Telemetry)
+
+The Controller must collect and report metrics:
+
+- **Node Metrics**: CPU, Memory, Disk usage of the host/node.
+- **Session Metrics**: CPU, Memory usage of individual sessions.
+- **Status**: Health status of the controller and the platform.
+
+### 3. Log (Stream)
+
+The Controller must provide access to session logs:
+
+- Stream container/VM logs back to the Control Plane on demand.
+
+### 4. Report (State Sync)
+
+The Controller must periodically sync the state of all managed resources:
+
+- List of active sessions.
+- Status of each session (Running, Hibernated, Failed).
+- Public endpoints (URLs) for accessing sessions.
+
+## Data Models
+
+### Session Spec (Abstract)
+
+The Control Plane sends a platform-agnostic `SessionSpec`:
+
+```json
+{
+  "id": "session-123",
+  "user": "user-abc",
+  "template": {
+    "image": "lscr.io/linuxserver/firefox:latest",
+    "env": {"PUID": "1000"},
+    "ports": [{"name": "vnc", "port": 3000}]
+  },
+  "resources": {
+    "cpu": "1000m",
+    "memory": "2Gi"
+  }
+}
+```
+
+### Platform Translation
+
+The Controller translates this spec into platform-specific resources:
+
+- **Kubernetes**: Pod, Service, Ingress, PVC.
+- **Docker**: Container, Network, Volume.
+- **Hyper-V**: VM, VSwitch, VHDX.
+
+## Implementation Guidelines
+
+### Language
+
+Go is recommended for all controllers to share common libraries (e.g., communication with Control Plane).
+
+### Common Library (`streamspace-agent-sdk`)
+
+A common SDK should be created to handle:
+
+- WebSocket/gRPC connection management.
+- Authentication handshake.
+- Command dispatching.
+- Metric collection primitives.
+
+## Future Controllers
+
+- **Docker Controller**: For single-node deployments (home labs).
+- **vCenter Controller**: For enterprise VM-based environments.
+- **LXD Controller**: For lightweight system containers.
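
Reviewer sketch (not part of the patch): the command set and `SessionSpec` described in `docs/CONTROLLER_SPEC.md` could be modelled in Go roughly as follows. This is a minimal illustration under stated assumptions. The `Command` envelope, its `type`/`sessionId`/`spec` keys, the `Platform` interface, and the in-memory `memoryPlatform` are hypothetical names that the spec does not define. Only the four command names and the `SessionSpec` JSON keys come from the document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SessionSpec mirrors the JSON example in the spec. The JSON keys come from
// the document; the Go type and field names are assumptions.
type SessionSpec struct {
	ID       string `json:"id"`
	User     string `json:"user"`
	Template struct {
		Image string            `json:"image"`
		Env   map[string]string `json:"env"`
		Ports []struct {
			Name string `json:"name"`
			Port int    `json:"port"`
		} `json:"ports"`
	} `json:"template"`
	Resources struct {
		CPU    string `json:"cpu"`
		Memory string `json:"memory"`
	} `json:"resources"`
}

// Command is a hypothetical wire envelope; the spec does not define one.
type Command struct {
	Type      string          `json:"type"`
	SessionID string          `json:"sessionId,omitempty"`
	Spec      json.RawMessage `json:"spec,omitempty"`
}

// Platform is what each platform-specific controller would implement
// (hypothetical interface built from the four commands named in the spec).
type Platform interface {
	StartSession(spec SessionSpec) error
	StopSession(id string) error
	HibernateSession(id string) error
	WakeSession(id string) error
}

// Dispatch decodes one command and routes it to the platform implementation.
func Dispatch(p Platform, raw []byte) error {
	var cmd Command
	if err := json.Unmarshal(raw, &cmd); err != nil {
		return fmt.Errorf("decode command: %w", err)
	}
	switch cmd.Type {
	case "StartSession":
		var spec SessionSpec
		if err := json.Unmarshal(cmd.Spec, &spec); err != nil {
			return fmt.Errorf("decode session spec: %w", err)
		}
		return p.StartSession(spec)
	case "StopSession":
		return p.StopSession(cmd.SessionID)
	case "HibernateSession":
		return p.HibernateSession(cmd.SessionID)
	case "WakeSession":
		return p.WakeSession(cmd.SessionID)
	default:
		return fmt.Errorf("unknown command type %q", cmd.Type)
	}
}

// memoryPlatform is an in-memory stand-in that only tracks session phases.
type memoryPlatform struct{ state map[string]string }

// set changes the phase of an existing session and fails for unknown IDs.
func (m *memoryPlatform) set(id, phase string) error {
	if _, ok := m.state[id]; !ok {
		return fmt.Errorf("session %q not found", id)
	}
	m.state[id] = phase
	return nil
}

func (m *memoryPlatform) StartSession(s SessionSpec) error {
	m.state[s.ID] = "Running"
	return nil
}

func (m *memoryPlatform) StopSession(id string) error {
	if _, ok := m.state[id]; !ok {
		return fmt.Errorf("session %q not found", id)
	}
	delete(m.state, id)
	return nil
}

func (m *memoryPlatform) HibernateSession(id string) error { return m.set(id, "Hibernated") }
func (m *memoryPlatform) WakeSession(id string) error      { return m.set(id, "Running") }

func main() {
	p := &memoryPlatform{state: map[string]string{}}
	start := []byte(`{"type":"StartSession","spec":{"id":"session-123","user":"user-abc",
		"template":{"image":"lscr.io/linuxserver/firefox:latest","env":{"PUID":"1000"},
		"ports":[{"name":"vnc","port":3000}]},"resources":{"cpu":"1000m","memory":"2Gi"}}}`)
	if err := Dispatch(p, start); err != nil {
		panic(err)
	}
	if err := Dispatch(p, []byte(`{"type":"HibernateSession","sessionId":"session-123"}`)); err != nil {
		panic(err)
	}
	fmt.Println(p.state["session-123"]) // prints "Hibernated"
}
```

Keeping the envelope separate from `SessionSpec` would let a shared SDK own decoding and routing while each controller supplies only a `Platform` implementation. Whether that split matches the intended `streamspace-agent-sdk` design is an open question for the author.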