π€ Your AI SRE that never sleeps. CloudSage transforms noisy logs into actionable risk forecasts.
CloudSage transforms noisy logs into actionable risk forecasts with AI-powered pattern learning, giving solo engineers an SRE assistant that predicts tomorrow's problems today.
Solo engineers and tiny teams run production systems without SRE expertise or monitoring budgets. They face:
- Growing error rates with no clear risk assessment
- Noisy logs that take hours to analyze
- Reactive firefighting instead of proactive prevention
- No early warning system for cascading failures
Result: 2am pages, burnt-out founders, and preventable downtime.
CloudSage is an AI-powered ops assistant that:
- Ingests logs (paste, upload, or sample data)
- Calculates risk score (0-100) via Vultr cloud compute
- Generates AI forecast with SmartInference chain branching
- Recommends 3 concrete actions (specific times, metrics, fixes)
- Learns patterns with SmartMemory to improve over time
- Forecasting, not just alerting - Predicts tomorrow's risk, not just today's errors
- Context-aware AI chains - Chooses emergency, preventive, or standard analysis based on your situation
- Transparent reasoning - Shows exactly why it made each recommendation
- Pattern learning - Gets smarter about your stack with every forecast
- Production-ready - Auth, payments, tests, deployed live
SmartBuckets - Intelligent log storage
- Hierarchical key structure (
projectId/timestamp/logId) - Forecast caching with 24-hour TTL
- Context sampling for historical analysis (last 10 logs)
- Native API with MCP fallback for resilience
SmartSQL - Production-grade analytics
- 3 tables: users, projects, risk_history
- Complex trend analysis (7-day rolling averages, slope calculations)
- Proper indexes and parameter interpolation
- Transaction safety with graceful fallbacks
SmartMemory - AI that learns
- User preferences (alert thresholds, ignored patterns)
- Action completion tracking (learns which actions you prioritize)
- Project baselines (30-score rolling averages for anomaly detection)
- Cross-session persistence for continuous improvement
SmartInference - Advanced AI orchestration
- Chain branching - Dynamically selects AI path:
- π¨ Critical (score β₯70): Emergency response chain
- π Preventive (rising + score >50): 7-day lookahead
- β Standard: Normal forecast generation
- Multi-step chains visible in UI (transparent AI reasoning)
- Confidence scoring based on data quality
- Heuristic fallback for resilience
Vultr Cloud Compute - Risk scoring worker
- Custom Node.js service analyzing 5 risk factors:
- Error rate (40 points max)
- Log volume (25 points)
- Latency indicators (20 points)
- Memory pressure (10 points)
- CPU usage (5 points)
- 142ms average latency (visible in UI)
- Health monitoring with automatic retries
- Fallback to local calculation if unavailable
Proof of integration:
- Live status badge showing latency + timestamp
- Infrastructure as code (Terraform in
/infra/vultr/) - Retry logic with exponential backoff
- Used on every log ingestion (real, not decorative)
WorkOS AuthKit - Enterprise-ready authentication
- Email magic links, OAuth (Google/Microsoft), SAML/OIDC
- Passwordless account detection
- MFA and directory sync ready
- 1M MAU free tier
Stripe - Payment processing
- Checkout flow with JIT user provisioning
- Pro plan ($29/month) for paid features
- Webhook infrastructure for subscription management
Security & Reliability
- Authorization guards (
ensureProjectAccessmiddleware) - Rate limiting (100 req/min ingest, 60 req/min forecast)
- Friendly error messages (SQL constraints β user-facing)
- Unit test coverage (5 core tests, integration suite)
- Frontend: https://steady-melomakarona-42c054.netlify.app
- Backend API: https://cloudsage-api.01kbv4q1d3d0twvhykd210v58w.lmapp.run/api
- GitHub: https://github.com/prabhakaran-jm/cloudsage-ai-ops-oracle
- Demo Video: https://www.youtube.com/watch?v=L55Yf8C7uY0
Try it now: Click "Load Sample Logs" for instant demo (no setup required)
Who needs this:
- Solo founders running production apps (500-100k users)
- Indie hackers with paying customers
- Startup CTOs with 2-5 person engineering teams
- Freelance developers managing client infrastructure
Measurable impact:
- Time saved: 2-4 hours/week on log analysis
- Incidents prevented: 6-24 hour early warning
- Stress reduced: Sleep through the night
- Cost saved: No $200/hr SRE consultants needed
β All 4 SmartComponents used meaningfully (not decorative) β SmartInference chain branching (shows AI orchestration mastery) β SmartMemory pattern learning (continuous improvement) β SmartSQL complex analytics (trend slopes, aggregations) β SmartBuckets context sampling (programmatic data access) β 1,360+ lines of Raindrop integration code
β Custom worker service (not just API calls) β Real-time latency visible in UI (proof of integration) β Infrastructure as code (Terraform) β Retry logic + fallback for resilience β Used on every request (real compute, not decorative)
β Deployed live (Netlify + Raindrop + Vultr) β WorkOS auth + Stripe payments β Unit tests + integration suite β Rate limiting + authorization guards β Error handling + loading states β Comprehensive documentation
- Real-time log streaming with Vultr Valkey (Redis-compatible)
- Slack/Discord notifications for critical alerts
- Multi-project dashboards and team management
- Auto-remediation suggestions (AI-generated code fixes)
- Mobile app for on-the-go monitoring
Raindrop:
- SmartInference chain transparency made debugging easy
- SmartMemory pattern learning was intuitive to implement
- SmartSQL's flexibility (native + MCP fallback) saved us
- Documentation was comprehensive
Vultr:
- Cloud Compute spin-up was instant
- Pricing is competitive for hackathon/startup budgets
- Terraform support made IaC easy
Raindrop:
- More examples of SmartBuckets AI search (we implemented sampling instead)
- Chain branching examples in docs would help
- SmartMemory with vector embeddings for semantic search
Vultr:
- Valkey (Redis-compatible) setup guide would be helpful
- Global latency optimization patterns
- More managed service examples (Kafka, databases)
- Solo developer: Built for The AI Champion Ship 2025
- Development time: 7 days (Dec 5-12, 2025)
- AI assistance: Claude Code (codebase architecture), Gemini CLI (debugging)
- Lines of code: ~4,500 (TypeScript + React)
- Track: Best Small Startup Agents
- Node.js 18+ and npm
- Raindrop account and API key
- Vultr account (for worker deployment)
- WorkOS account (for authentication)
- Stripe account (optional, for payments)
Raindrop (Backend):
RAINDROP_API_KEY- Your Raindrop API keyRAINDROP_MCP_URL- Raindrop MCP server URL (default: http://localhost:3002)VULTR_WORKER_URL- Your Vultr worker URLVULTR_API_KEY- Vultr worker API keyJWT_SECRET- Secret for JWT token signing
WorkOS (Authentication):
WORKOS_CLIENT_ID- Your WorkOS client IDWORKOS_API_KEY- Your WorkOS API keyWORKOS_REDIRECT_URI- Callback URLWORKOS_COOKIE_PASSWORD- At least 32 characters for session encryption
Stripe (Payments - Optional):
STRIPE_SECRET_KEY- Your Stripe secret keySTRIPE_PUBLISHABLE_KEY- Your Stripe publishable keySTRIPE_WEBHOOK_SECRET- Webhook signing secretNEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY- Same as publishable key (for frontend)NEXT_PUBLIC_STRIPE_PRO_PRICE_ID- Stripe Price ID for Pro plan
# Install dependencies
npm install
# Run frontend
npm run dev:web
# Run backend API
npm run dev:api
# Run Vultr worker (local)
npm run dev:vultr-workerSee infra/deploy/DEPLOYMENT_GUIDE.md for complete deployment instructions.
Quick deploy:
# Deploy Vultr worker
./infra/deploy/deploy-vultr-worker.sh
# Deploy backend to Raindrop
raindrop build deploy --start --amend
# Deploy frontend to Netlify
cd apps/web && netlify deploy --prod- ARCHITECTURE.md - System architecture and component design
- AI_ASSISTANT_USAGE.md - How Claude Code and Gemini CLI were used in development
- DEPLOYMENT_GUIDE.md - Complete deployment instructions
- Infrastructure README - Infrastructure overview and setup
- License: MIT
- GitHub: https://github.com/prabhakaran-jm/cloudsage-ai-ops-oracle
- Demo: https://steady-melomakarona-42c054.netlify.app
Built with β€οΈ for The AI Champion Ship 2025 using LiquidMetal AI(Raindrop), Vultr, WorkOS, Stripe, Netlify, Cloudflare.